46

Gravitational waves

46.1 Waves in a gauge theory
46.2 Lorenz gauge for gravitational waves
46.3 Quadrupolar radiation
46.4 Radiated energy and power
46.5 An exact solution
46.6 The discovery of gravitational waves
Chapter summary
Exercises
$^{1}$ LIGO stands for Laser Interferometer Gravitational-Wave Observatory.
$^{2}$ In the presence of sources, the equation is
$$-\partial_{\nu}\left(\partial_{\mu} A^{\mu}\right)+\partial^{2} A_{\nu}=-J_{\nu}$$
Like as the waves make towards the pebbled shore, So do our minutes hasten to their end
William Shakespeare (1564-1616) Sonnet 60
In this chapter, we discuss the waves that can propagate as excitations of the gravitational field. These waves were predicted by Einstein in 1916 on the basis of general relativity (they had been suggested previously by Henri Poincaré) and have been the subject of two relatively recent Nobel prizes: the 1993 award to Hulse and Taylor, whose work on binary pulsars offered indirect evidence for the waves, and the 2017 prize for Weiss, Thorne and Barish. The latter was awarded in the wake of the direct observation (by the LIGO$^{1}$ collaboration) of gravitational waves that resulted from the merger of two black holes. We describe the LIGO experiment at the end of the chapter. We start, however, with a review of electromagnetic waves, before repeating the argument for waves in a weak gravitational field.

46.1 Waves in a gauge theory

We saw in Chapter 42 that, in flat spacetime with no sources, an equation of motion for the electromagnetic field can be written as$^{2}$
\begin{equation*} -\partial_{\nu}\left(\partial_{\mu} A^{\mu}\right)+\partial^{2} A_{\nu}=0 \tag{46.1} \end{equation*}
This equation tells us that the electromagnetic field has dynamics of its own, independent of the presence of electric charges. These dynamics are wave-like excitations of the field.
As usual, we're free to make changes to $\tilde{\boldsymbol{A}}$ of the form $A_{\mu}(x) \rightarrow A_{\mu}(x)-\partial_{\mu} \chi(x)$, since such changes of gauge alter neither the dynamics of the electromagnetic field nor the things coupled to that field.
Example 46.1
As discussed in Chapter 42, we choose the Lorenz gauge, which means we define a new, but physically equivalent, gauge field with components $A_{\mu}^{\prime}$ which obey the constraint $\partial_{\mu} A^{\prime \mu}=0$, helpfully knocking out the first term in eqn 46.1. This leaves us with a simpler-looking equation of motion
$$\partial^{2} A^{\prime \mu}=0$$
whose solutions are plane waves of the form$^{3}$
\begin{equation*} A^{\mu}=\operatorname{Re}\left[\epsilon^{\mu}(\boldsymbol{k}) \mathrm{e}^{\mathrm{i} \boldsymbol{k} \cdot \boldsymbol{x}}\right] \tag{46.3} \end{equation*}
where $\omega=|\vec{k}|$ and $\operatorname{Re}[\,]$ reminds us to take the real part of a complex expression. However, we also saw in Chapter 42 that since the Lorenz gauge still leaves some freedom,$^{4}$ we could then impose the Coulomb gauge, upgrading to a new gauge field with components $A_{\mu}^{\prime \prime}$ where $A_{0}^{\prime \prime}=0$. With this further choice, the Lorenz condition then becomes $\vec{\nabla} \cdot \vec{A}^{\prime \prime}=0$, which further reduces the number of independent field components by one. This makes it clear that although the electromagnetic field has four components, the physics allows only two independent components.
The equations of motion$^{5}$ in the Lorenz gauge read $\partial^{2} A^{\mu}=0$, which, with $A^{0}=0$, has plane wave solutions $\vec{A}=\vec{\epsilon}\, \mathrm{e}^{\mathrm{i} \boldsymbol{k} \cdot \boldsymbol{x}}$. The equation encoding the Coulomb gauge condition, $\vec{\nabla} \cdot \vec{A}=0$, leads to
\begin{equation*} \vec{k} \cdot \vec{A}=\vec{k} \cdot \vec{\epsilon}=0 \tag{46.4} \end{equation*}
which tells us that the direction of propagation of the wave is perpendicular to the polarization $\vec{\epsilon}$, i.e. the wave is transverse. For a wave propagating along $z$ with null momentum components $k^{\mu}=(|\vec{k}|, 0,0,|\vec{k}|)$, the components of the electromagnetic field must then be functions of the form
$$A^{t}=0, \quad A^{x}=A^{x}(-\omega t+|\vec{k}| z), \quad A^{y}=A^{y}(-\omega t+|\vec{k}| z), \quad A^{z}=0$$
Comparing with our plane wave, we spot solutions such as $A^{j}=\epsilon^{j}(\boldsymbol{k}) \mathrm{e}^{-\mathrm{i}(\omega t-|\vec{k}| z)}$, for $j=x$ and $y$, with$^{6}$ $\omega /|\vec{k}|=1$. Putting everything together, we see that two possible choices of basis polarization vectors are simply
\begin{equation*} \vec{\epsilon}_{1}(\boldsymbol{k})=\left(\begin{array}{l} 1 \\ 0 \\ 0 \end{array}\right), \quad \vec{\epsilon}_{2}(\boldsymbol{k})=\left(\begin{array}{l} 0 \\ 1 \\ 0 \end{array}\right) \tag{46.6} \end{equation*}
corresponding to linear polarization along $x$ or $y$, respectively. This explains, in classical terms, the propagation of light waves in the electromagnetic field.$^{7}$
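The two properties used above (the dispersion relation $\omega=|\vec{k}|$ and transversality) can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `box_residual`, the sample point and the step size are illustrative assumptions, and the derivative is estimated by central differences for the profile $\cos(-\omega t+|\vec{k}| z)$.

```python
import numpy as np

def box_residual(omega, k, t=0.3, z=0.7, h=1e-3):
    """Central-difference estimate of (-d^2/dt^2 + d^2/dz^2) f at (t, z)
    for the travelling profile f = cos(-omega t + k z)."""
    f = lambda t_, z_: np.cos(-omega * t_ + k * z_)
    d2t = (f(t + h, z) - 2 * f(t, z) + f(t - h, z)) / h**2
    d2z = (f(t, z + h) - 2 * f(t, z) + f(t, z - h)) / h**2
    return -d2t + d2z

# Basis polarization vectors of eqn 46.6 and a unit wavevector along z.
eps1 = np.array([1.0, 0.0, 0.0])
eps2 = np.array([0.0, 1.0, 0.0])
k_vec = np.array([0.0, 0.0, 1.0])

on_shell = box_residual(2.0, 2.0)   # omega = |k|: wave equation satisfied
off_shell = box_residual(3.0, 2.0)  # omega != |k|: wave equation violated
transverse = (k_vec @ eps1, k_vec @ eps2)  # both vanish (eqn 46.4)
```

The residual vanishes (up to rounding) only when $\omega=|\vec{k}|$, which is the statement that the wave is null.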
Now we repeat the argument for gravitation in the weak-field limit. Using the gravitational version of the Lorenz gauge $\partial^{\nu} \bar{h}_{\mu \nu}=0$, we have a wave equation for gravity in the absence of sources, that says$^{8}$
\begin{equation*} \partial^{2} \bar{h}_{\mu \nu}=0 \tag{46.8} \end{equation*}
The good news is that this looks a lot like the wave equation for the electromagnetic field. It does however involve a second index, meaning that the polarizations of the fields that solve this wave equation have to be represented as square matrices, rather than simply as column vectors, as we had in the case of the electromagnetic field.
Let's consider an ansatz in the form of a plane gravitational wave. This can be written as
\begin{equation*} \bar{h}_{\mu \nu}=\operatorname{Re}\left[A_{\mu \nu} \mathrm{e}^{\mathrm{i} \boldsymbol{k} \cdot \boldsymbol{x}}\right] \tag{46.9} \end{equation*}
where $\operatorname{Re}[\,]$ again reminds us that we must take the real part of this plane wave solution to describe the physical amplitude of the wave.

Example 46.2

For this to work within the Lorenz gauge we require $\partial^{\nu} \bar{h}_{\mu \nu}=0$, or
\begin{equation*} \bar{h}_{\mu \nu}{}^{, \nu}=\mathrm{i} k^{\nu} A_{\mu \nu} \mathrm{e}^{\mathrm{i} \boldsymbol{k} \cdot \boldsymbol{x}}=0 \tag{46.10} \end{equation*}
To satisfy the field equations we must also have $\partial^{2} \bar{h}_{\mu \nu}=0$ (eqn 46.8), so that
$$-k_{\sigma} k^{\sigma} A_{\mu \nu} \mathrm{e}^{\mathrm{i} \boldsymbol{k} \cdot \boldsymbol{x}}=0$$
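The step from the plane-wave ansatz to these two conditions can be spelled out as follows. This is a sketch that assumes only that $k_{\mu}$ and $A_{\mu \nu}$ are constants, so that each derivative acts on the exponential alone.

```latex
\begin{align*}
\partial_{\sigma}\, \mathrm{e}^{\mathrm{i} k_{\mu} x^{\mu}}
  &= \mathrm{i} k_{\sigma}\, \mathrm{e}^{\mathrm{i} \boldsymbol{k} \cdot \boldsymbol{x}}, \\
\partial^{\nu} \bar{h}_{\mu \nu}
  &= \mathrm{i} k^{\nu} A_{\mu \nu}\, \mathrm{e}^{\mathrm{i} \boldsymbol{k} \cdot \boldsymbol{x}}, \\
\partial^{2} \bar{h}_{\mu \nu}
  = \partial^{\sigma} \partial_{\sigma} \bar{h}_{\mu \nu}
  &= (\mathrm{i})^{2} k^{\sigma} k_{\sigma} A_{\mu \nu}\, \mathrm{e}^{\mathrm{i} \boldsymbol{k} \cdot \boldsymbol{x}}
  = -k_{\sigma} k^{\sigma} A_{\mu \nu}\, \mathrm{e}^{\mathrm{i} \boldsymbol{k} \cdot \boldsymbol{x}} .
\end{align*}
```

Because the exponential never vanishes, each condition becomes a purely algebraic statement about $k^{\mu}$ and $A_{\mu \nu}$.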
$^{3}$ The term $k_{\mu} x^{\mu}$ is written here as $\boldsymbol{k} \cdot \boldsymbol{x}$. We also drop the prime on the field $A^{\prime \mu}$.
$^{4}$ Recall that this is because we can make a further shift $A_{\mu}^{\prime} \rightarrow A_{\mu}^{\prime \prime}=A_{\mu}^{\prime}-\partial_{\mu} \xi$ as long as $\partial^{2} \xi=0$ (so that both $A_{\mu}^{\prime}$ and $A_{\mu}^{\prime \prime}$ satisfy the Lorenz condition).
$^{5}$ We are now going to focus on $A_{\mu}^{\prime \prime}$, so will drop the double prime from now on. We also suppress the $\operatorname{Re}[\,]$ notation, which is assumed for the wave solutions.
$^{6}$ Of course, this implies that in SI units $\omega /|\vec{k}|=c$.
$^{7}$ The polarization vectors introduced here carry the information about the spin state of the photon. Further aspects of the quantum-mechanical treatment are the topic of the following ...
$^{8}$ Remember that when sources are present, the key equation is
\begin{equation*} -\partial^{2} \bar{h}_{\mu \nu}=16 \pi T_{\mu \nu} \tag{46.7} \end{equation*}
We will return to this later when we put the sources back in, see eqn 46.39.
$^{9}$ The following argument will make this statement a little more convincing. Let's consider a gravitational plane wave in $\bar{h}_{\mu \nu}$, whose components are constant on a surface on which its phase $\boldsymbol{k} \cdot \boldsymbol{x}=k_{\mu} x^{\mu}$ is constant. A photon moving in the direction of the null vector $\boldsymbol{k}$ travels on the curve
\begin{equation*} x^{\mu}(\lambda)=k^{\mu} \lambda+l^{\mu} \tag{46.13} \end{equation*}
where $l^{\mu}$ are the components of a constant vector and $\lambda$ parametrizes the curve. Dotting the equation for the curve with $k_{\mu}$ and noting $\boldsymbol{k} \cdot \boldsymbol{k}=0$, we find
\begin{equation*} k_{\mu} x^{\mu}=k_{\mu} l^{\mu}=\text { const. } \tag{46.14} \end{equation*}
This implies that the photon wave and gravitational wave share the same surfaces on which their respective phases are constant, and in fact their respective phases can only differ by a constant scalar value. Thus, the two waves move essentially in lockstep. We can therefore conclude that the gravitational wave travels at the speed of light with $\vec{k}$ giving its direction of travel.
$^{10}$ The analogous expression for electromagnetism was $\partial^{2} \xi=0$. As you might expect, the gravitational case looks very similar but has an extra index.
As a consequence of the last example, we have that the amplitude $A_{\mu \nu}$ and wavevector $k^{\mu}$ components obey two constraints
\begin{equation*} A_{\alpha \mu} k^{\mu}=0, \quad \boldsymbol{k} \cdot \boldsymbol{k}=0 . \tag{46.12} \end{equation*}
We conclude from the second expression that the gravitational wave is null, implying that the gravitational field propagates at the speed of light.$^{9}$ From the first condition we have that the wave's amplitude is orthogonal to its direction, making it a transverse plane wave. Analogous to the electromagnetic plane wave travelling along the $z$-direction, we have that the components of the gravitational field are given by functions
\begin{gather*} \bar{h}_{x x}=\bar{h}_{x x}(-\omega t+|k| z), \quad \bar{h}_{x y}=\bar{h}_{y x}=\bar{h}_{x y}(-\omega t+|k| z), \tag{46.15}\\ \bar{h}_{y y}=\bar{h}_{y y}(-\omega t+|k| z), \quad \bar{h}_{\mu z}=0 \quad \text { for all } \mu . \end{gather*}
Also by analogy with the electromagnetic plane wave, we would like to make a further choice of gauge to guarantee that $\bar{h}^{\mu 0}=0$. It turns out that we can do exactly this, as we shall see in the next section.

46.2 Lorenz gauge for gravitational waves

We have already chosen the Lorenz gauge to guarantee the wave equation but, just as in the electromagnetic case, this doesn't exhaust the gauge freedom. Remembering that the gauge transformation we are considering is $x^{\mu} \rightarrow x^{\mu}+\xi^{\mu}$, we note by analogy with electromagnetism that we can still obey the Lorenz gauge as long as we have$^{10}$
\begin{equation*} \partial^{2} \xi_{\alpha}=0 \tag{46.16} \end{equation*}
Gravitational waves are a little more complicated than their electromagnetic cousins, so we shall proceed step by step.

Example 46.3

We choose $\xi_{\alpha}=B_{\alpha} \mathrm{e}^{\mathrm{i} \boldsymbol{k} \cdot \boldsymbol{x}}$, so that the transformation is oscillatory in spacetime with amplitude $B_{\alpha}$. The usual change in $\boldsymbol{h}$ resulting from a gauge transformation is given by
\begin{equation*} h_{\alpha \beta}^{\prime}=h_{\alpha \beta}-\xi_{\alpha, \beta}-\xi_{\beta, \alpha}, \tag{46.17} \end{equation*}
which means, for the trace-reversed components, that
\begin{equation*} \bar{h}_{\alpha \beta}^{\prime}=\bar{h}_{\alpha \beta}-\xi_{\alpha, \beta}-\xi_{\beta, \alpha}+\eta_{\alpha \beta} \xi^{\mu}{}_{, \mu} . \tag{46.18} \end{equation*}
Substituting the gauge choice gives a condition on the solution that
\begin{equation*} A_{\alpha \beta}^{\prime}=A_{\alpha \beta}-\mathrm{i}\left(B_{\alpha} k_{\beta}+B_{\beta} k_{\alpha}-\eta_{\alpha \beta} B^{\mu} k_{\mu}\right) \tag{46.19} \end{equation*}
This satisfies our previous constraint $k^{\alpha} A_{\alpha \beta}^{\prime}=0$, as long as $A_{\alpha \beta}$ does too.
The amplitude of the transformation $B_{\alpha}$ is then chosen in such a way as to impose two additional (highly simplifying) constraints on the amplitude components $A_{\mu \nu}$ of our wave-like solution. These are
\begin{equation*} A^{\alpha}{ }_{\alpha}=0, \quad A_{\alpha \beta} u^{\beta}=0, \tag{46.20} \end{equation*}
where, in the second expression, $u^{\beta}$ are the components of a fixed velocity. The first condition tells us that the wave is traceless, which means that, in this gauge, $h_{\mu \nu}=\bar{h}_{\mu \nu}$. The second says that the wave is orthogonal to a velocity vector $\boldsymbol{u}$. This state of affairs is known as transverse-traceless gauge.
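The claim attached to eqn 46.19, that the transformed amplitude still obeys $k^{\alpha} A_{\alpha \beta}^{\prime}=0$, can be tested numerically. The sketch below is illustrative rather than part of the original argument: it assumes a wave travelling along $z$ with $|\vec{k}|=1$, and the helper `transformed_amplitude`, the random seed and the way the starting amplitude is constructed are all choices made here for the check.

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])    # Minkowski metric, signature (-,+,+,+)
k_up = np.array([1.0, 0.0, 0.0, 1.0])   # null wavevector k^mu, travel along z
k_dn = eta @ k_up                        # lowered components k_mu

def transformed_amplitude(A, B_dn):
    """Eqn 46.19: A' = A - i (B_a k_b + B_b k_a - eta_ab B^mu k_mu)."""
    B_dot_k = B_dn @ k_up                # B_mu k^mu = B^mu k_mu
    return A - 1j * (np.outer(B_dn, k_dn) + np.outer(k_dn, B_dn) - eta * B_dot_k)

rng = np.random.default_rng(0)

# Build a symmetric A_ab with A_ab k^b = 0 out of covectors that annihilate k^mu.
basis = np.vstack([[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], k_dn])
C = rng.normal(size=(3, 3))
C = C + C.T                              # symmetric coefficients
A = basis.T @ C @ basis

# An arbitrary complex gauge amplitude B_alpha.
B_dn = rng.normal(size=4) + 1j * rng.normal(size=4)
A_prime = transformed_amplitude(A, B_dn)
```

The cancellation relies on $\boldsymbol{k} \cdot \boldsymbol{k}=0$: the $B_{\alpha} k_{\beta}$ term drops out when contracted with $k^{\beta}$, and the remaining two terms cancel each other.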
Example 46.4
Choose a local inertial frame in which $\boldsymbol{u}$ has components $u^{\mu}=(1,0,0,0)$. We then have from $A_{\alpha \beta} u^{\beta}=0$ (eqn 46.20) the condition that we wanted, that $A_{\alpha 0}=0$. We arrange for the wave to travel along $z$, so we have $\boldsymbol{k}$ with components $k^{\mu}=(|\vec{k}|, 0,0,|\vec{k}|)$. This means from $A_{\alpha \beta} k^{\beta}=0$ (eqn 46.12) that $A_{\alpha z}=0$ too. We therefore have a possibility of non-zero matrix elements for $A_{x x}, A_{y y}, A_{x y}$ and $A_{y x}$. Since the wave is traceless, we must have $A_{x x}=-A_{y y}$. By symmetry $A_{x y}=A_{y x}$. We therefore have, in this frame, the amplitude components
\begin{equation*} A_{\alpha \beta}=\left(\begin{array}{cccc} 0 & 0 & 0 & 0 \\ 0 & A_{x x} & A_{x y} & 0 \\ 0 & A_{x y} & -A_{x x} & 0 \\ 0 & 0 & 0 & 0 \end{array}\right) \tag{46.21} \end{equation*}
We then have a simplified solution to the wave equation $\bar{h}_{\alpha \beta}=A_{\alpha \beta} \mathrm{e}^{\mathrm{i} \boldsymbol{k} \cdot \boldsymbol{x}}$, with $\omega=|\vec{k}|$. Remembering that $g_{\mu \nu}=\eta_{\mu \nu}+h_{\mu \nu}$, and also that $\bar{h}_{\mu \nu}=h_{\mu \nu}$ with our choice of gauge, we deduce that this solution results in a metric line element
\begin{align*} \mathrm{d} s^{2}= & -\mathrm{d} t^{2}+\left(1+A_{x x} \mathrm{e}^{-\mathrm{i}(\omega t-|k| z)}\right) \mathrm{d} x^{2}+2 A_{x y} \mathrm{e}^{-\mathrm{i}(\omega t-|k| z)} \mathrm{d} x \mathrm{~d} y \\ & +\left(1-A_{x x} \mathrm{e}^{-\mathrm{i}(\omega t-|k| z)}\right) \mathrm{d} y^{2}+\mathrm{d} z^{2} \tag{46.22} \end{align*}
The two independent solutions represented here can be disentangled as follows. If $A_{x y}=0$, then our metric reduces to
\begin{equation*} \mathrm{d} s^{2}=-\mathrm{d} t^{2}+\left(1+A_{x x} \mathrm{e}^{-\mathrm{i}(\omega t-|k| z)}\right) \mathrm{d} x^{2}+\left(1-A_{x x} \mathrm{e}^{-\mathrm{i}(\omega t-|k| z)}\right) \mathrm{d} y^{2}+\mathrm{d} z^{2} \tag{46.23} \end{equation*}
On the other hand, if $A_{x x}=0$, our metric reduces to
\begin{equation*} \mathrm{d} s^{2}=-\mathrm{d} t^{2}+\mathrm{d} x^{2}+2 A_{x y} \mathrm{e}^{-\mathrm{i}(\omega t-|k| z)} \mathrm{d} x \mathrm{~d} y+\mathrm{d} y^{2}+\mathrm{d} z^{2} \tag{46.24} \end{equation*}
Each of these solutions represents a plane gravitational wave, and eqns 46.23 and 46.24 are related to each other by a $45^{\circ}$ rotation.
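As a quick consistency check on eqn 46.21, the amplitude matrix can be built explicitly and tested against the three conditions it was derived from. This is a sketch only: the function name `tt_amplitude` and the sample values of $A_{x x}$ and $A_{x y}$ are illustrative assumptions, and $|\vec{k}|$ is set to 1.

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])   # Minkowski metric, (t, x, y, z) ordering

def tt_amplitude(a_xx, a_xy):
    """Amplitude matrix A_{alpha beta} of eqn 46.21."""
    A = np.zeros((4, 4))
    A[1, 1], A[2, 2] = a_xx, -a_xx
    A[1, 2] = A[2, 1] = a_xy
    return A

A = tt_amplitude(0.3, -0.1)             # arbitrary sample amplitudes
u_up = np.array([1.0, 0.0, 0.0, 0.0])   # observer at rest, u^mu
k_up = np.array([1.0, 0.0, 0.0, 1.0])   # null wavevector along z, k^mu

# A^alpha_alpha = eta^{alpha beta} A_{beta alpha}
trace = np.trace(np.linalg.inv(eta) @ A)
A_dot_u = A @ u_up                      # second condition of eqn 46.20
A_dot_k = A @ k_up                      # first condition of eqn 46.12
```

Only two free numbers survive, $A_{x x}$ and $A_{x y}$, matching the two independent polarizations.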
The metric line element has some oscillatory terms in it, but what does this mean? Does this mean that individual masses suspended in space will bob up and down as a gravitational wave goes past, just like boats on the ocean do when ocean waves go past? It's a bit more complicated than that, as shown in the next example.
Example 46.5
Place a test mass at rest at the origin. Its velocity is then $\dot{x}^{\mu}=(1,0,0,0)$. The geodesic equation, eqn 8.24, is $\ddot{x}^{\mu}+\Gamma_{\alpha \beta}^{\mu} \dot{x}^{\alpha} \dot{x}^{\beta}=0$, so that
\begin{equation*} \ddot{x}^{i}+\Gamma_{00}^{i} \dot{x}^{0} \dot{x}^{0}=\ddot{x}^{i}+\Gamma_{00}^{i}=0 \tag{46.25} \end{equation*}
However, for a gravitational wave we must have $\Gamma_{00}^{i}=0$, since $\Gamma_{a b}^{i}=\frac{1}{2} \eta^{i c}\left(\partial_{a} h_{c b}+\partial_{b} h_{c a}-\partial_{c} h_{a b}\right)$ vanishes for $a=b=0$ because $h_{0 a}=0$. This implies that $\ddot{x}^{i}=0$ so that the particle doesn't move. Oh dear; this is not what we wanted!
However, the fact that the coordinate of our test mass does not change as the gravitational wave rolls past does not mean anything. By now, we have learned to be suspicious of coordinates which can be chosen in lots of different ways. We know the metric line element does oscillate, so for example if we put a first test mass at the origin and a second test mass displaced a distance $L$ in the $x$-direction, then the distance between them should be
$$\int_{0}^{L} \sqrt{g_{x x}} \mathrm{~d} x=\operatorname{Re}\left[L \sqrt{1+A_{x x} \mathrm{e}^{-\mathrm{i}(\omega t-|k| z)}}\right] \approx L+\frac{L A_{x x}}{2} \cos (\omega t-|k| z).$$
Happily this does oscillate, and also gives a method of detecting gravitational waves: by measuring the distance between pairs of masses as a function of time.
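The size of the error in the linearized distance formula can be gauged with a short numerical comparison. The sketch below is an illustration under stated assumptions: the function names, the choice $L=1$ and the amplitude $A_{x x}=10^{-6}$ are made up for the check (a realistic strain would be far smaller, but would be swamped by double-precision rounding here).

```python
import numpy as np

def proper_distance(L, a_xx, phase):
    """Re[L sqrt(1 + A_xx e^{-i phase})], with phase = omega t - |k| z."""
    return np.real(L * np.sqrt(1 + a_xx * np.exp(-1j * phase)))

def linearized_distance(L, a_xx, phase):
    """First-order approximation L + (L A_xx / 2) cos(phase)."""
    return L + 0.5 * L * a_xx * np.cos(phase)

phases = np.linspace(0.0, 2.0 * np.pi, 200)   # one full wave cycle
L, a_xx = 1.0, 1e-6

exact = proper_distance(L, a_xx, phases)
approx = linearized_distance(L, a_xx, phases)
max_err = np.max(np.abs(exact - approx))      # second order in A_xx
swing = approx.max() - approx.min()           # peak-to-peak change, about L * A_xx
```

The discrepancy is second order in $A_{x x}$, while the oscillation itself is first order, so the approximation is excellent for any weak wave.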
We can therefore understand gravitational waves by assessing their influence on groups of tiny test masses. Let's therefore examine in more detail the geodesic deviation of a set of particles. Geodesic deviation is described by the equation
\begin{equation*} \frac{D^{2} \boldsymbol{n}}{\mathrm{~d} \tau^{2}}+\boldsymbol{R}(\ , \boldsymbol{u}, \boldsymbol{n}, \boldsymbol{u})=0 \tag{46.27} \end{equation*}
where $\boldsymbol{n}$ is the separation vector of the particles and the velocity $\boldsymbol{u}$ is tangent to the streamlines formed by the geodesics. We will work in the local frame of the particles, amounting to a choice of the components of $\boldsymbol{u}$ as $u^{\mu}=(1,0,0,0)$, so all we need to do is compute the relevant components of $\boldsymbol{R}$ from the components of the $\boldsymbol{h}$ field.

Example 46.6

With our choice of $\boldsymbol{u}$ the geodesic deviation expression becomes the component equation
\begin{equation*} \frac{D^{2} n^{\mu}}{\mathrm{d} \tau^{2}}=-R^{\mu}{}_{0 \alpha 0} n^{\alpha} \tag{46.28} \end{equation*}
For simplicity, let's choose $\boldsymbol{n}$ to initially have components $n^{\mu}=(0, a, 0,0)$, implying that the two masses are separated by the spacelike interval $a$ at the start of the motion. From the last chapter, we have that
\begin{equation*} R_{\alpha \beta \mu \nu}=\frac{1}{2}\left(h_{\alpha \nu, \beta \mu}-h_{\alpha \mu, \beta \nu}+h_{\beta \mu, \alpha \nu}-h_{\beta \nu, \alpha \mu}\right) . \tag{46.29} \end{equation*}
Recalling also that we raise and lower indices in the weak-field limit using $\eta_{\mu \nu}=\operatorname{diag}(-1,1,1,1)$, we find that the components of the Riemann tensor relevant to the geodesic deviation equation then become
\begin{align*} & R^{x}{}_{0 x 0}=R_{x 0 x 0}=-\frac{1}{2} \frac{\partial^{2} h_{x x}}{\partial t^{2}} \\ & R^{y}{}_{0 x 0}=R_{y 0 x 0}=-\frac{1}{2} \frac{\partial^{2} h_{x y}}{\partial t^{2}} \\ & R^{y}{}_{0 y 0}=R_{y 0 y 0}=-\frac{1}{2} \frac{\partial^{2} h_{y y}}{\partial t^{2}}=-R^{x}{}_{0 x 0} \tag{46.30} \end{align*}
A further simplification is that, to first order in $h_{\mu \nu}$, we can make the replacement $\tau=t$. This means that the separation vector of the particles, originally separated along the $x$-direction by an interval $a$, obeys the equations of motion
\begin{equation*} \frac{\partial^{2} n^{x}}{\partial t^{2}}=\frac{1}{2} a \frac{\partial^{2} h_{x x}}{\partial t^{2}}, \quad \frac{\partial^{2} n^{y}}{\partial t^{2}}=\frac{1}{2} a \frac{\partial^{2} h_{x y}}{\partial t^{2}} \tag{46.31} \end{equation*}
By the same token, two particles initially separated along $y$ by a spacelike interval $a$ obey
\begin{equation*} \frac{\partial^{2} n^{y}}{\partial t^{2}}=-\frac{1}{2} a \frac{\partial^{2} h_{x x}}{\partial t^{2}}, \quad \frac{\partial^{2} n^{x}}{\partial t^{2}}=\frac{1}{2} a \frac{\partial^{2} h_{x y}}{\partial t^{2}} \tag{46.32} \end{equation*}
On integrating these differential equations twice, we find that the separation $n^{i}$ of particles can be described in terms of the components of the $\boldsymbol{h}$-field in transverse-traceless gauge by writing
\begin{equation*} n^{i}=\frac{1}{2} h_{i j}^{\mathrm{TT}} x^{j}, \tag{46.33} \end{equation*}
where $h_{x x}^{\mathrm{TT}}=-h_{y y}^{\mathrm{TT}}=h_{+}$ and $h_{x y}^{\mathrm{TT}}=h_{y x}^{\mathrm{TT}}=h_{\times}$.
The equations from the previous example allow us to understand the polarization of the gravitational waves as causing motion of the particles arranged in a circle in Fig. 46.1(a). If the waves have $h_{x y}=0$ and $h_{x x} \neq 0$, then the pattern of displacements corresponds to that shown in Fig. 46.1(b), with the masses moving along the $x$ and $y$ directions out of phase by $180^{\circ}$. This is sometimes called the $+$ polarization. If, instead, we have $h_{x x}=0$ and $h_{x y} \neq 0$, then the pattern of displacements is that shown in Fig. 46.1(c). This is simply the pattern from Fig. 46.1(b) rotated by $45^{\circ}$, hence the name: $\times$ polarization. Notice how the two different polarizations are related by a $45^{\circ}$ rotation,$^{11}$ unlike the two electromagnetic linear polarizations, which are related by a $90^{\circ}$ rotation. This is a consequence of the tensorial nature of the $(0,2)$ gravitational $\boldsymbol{h}$ field, as opposed to the 1-form field $\tilde{\boldsymbol{A}}$ that expresses electromagnetism.
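The displacement patterns described here can be generated directly from eqn 46.33. The sketch below rests on an assumption worth flagging: it reads $\frac{1}{2} h_{i j}^{\mathrm{TT}} x^{j}$ as the wave-induced shift added to each mass's rest position $x^{i}$. The function name `displaced_ring`, the number of masses and the exaggerated strain value are illustrative choices, not taken from the text.

```python
import numpy as np

def displaced_ring(h_plus, h_cross, n_masses=8, radius=1.0):
    """Return rest positions and displaced positions of a ring of test masses
    in the x-y plane, using shift^i = (1/2) h^TT_ij x^j."""
    angles = 2.0 * np.pi * np.arange(n_masses) / n_masses
    rest = radius * np.column_stack([np.cos(angles), np.sin(angles)])
    h_tt = np.array([[h_plus, h_cross],
                     [h_cross, -h_plus]])   # transverse (x, y) block of h^TT
    # h_tt is symmetric, so acting on row vectors from the right is equivalent.
    return rest, rest + 0.5 * rest @ h_tt

# Hugely exaggerated strain so the pattern is visible in the numbers.
rest, plus_ring = displaced_ring(h_plus=0.2, h_cross=0.0)
_, cross_ring = displaced_ring(h_plus=0.0, h_cross=0.2)
```

With the $+$ polarization the mass on the $x$-axis moves outwards while the mass on the $y$-axis moves inwards; with the $\times$ polarization the stretch and squeeze occur along the diagonals instead.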
Example 46.7
We can define the tidal field $\mathcal{E}$ with components $\mathcal{E}_{i j}=R_{0 i 0 j}$, which can be written in terms of the two possible polarizations of the waves as
\begin{align*} \mathcal{E} & =-\frac{1}{2} \ddot{\boldsymbol{h}}^{\mathrm{TT}} \\ & =-\frac{1}{2}\left(\ddot{h}_{+} \boldsymbol{e}_{+}+\ddot{h}_{\times} \boldsymbol{e}_{\times}\right), \tag{46.34} \end{align*}
where
$\boldsymbol{e}_{+}=\boldsymbol{e}_{x} \otimes \boldsymbol{e}_{x}-\boldsymbol{e}_{y} \otimes \boldsymbol{e}_{y}$,
and
$\boldsymbol{e}_{\times}=\boldsymbol{e}_{x} \otimes \boldsymbol{e}_{y}+\boldsymbol{e}_{y} \otimes \boldsymbol{e}_{x}$,
are polarization tensors.
We can also depict the two polarizations by looking at the accelerations that are produced by the waves. This is shown in the diagrams in Fig. 46.2, where the field lines represent the acceleration field $D^{2} \boldsymbol{n} / \mathrm{d} \tau^{2}$. The four diagrams display the field at different points in the wave cycle for the two polarizations. These field diagrams are reminiscent of the magnetic field from a quadrupolar magnet and in fact they do represent quadrupolar fields. This point deserves some further exploration, which we will do in the following example, first considering the case of sources of electromagnetic radiation before considering the analogous case of sources of gravitational radiation.

Fig. 46.1 Polarizations of gravitational waves. (a) A circle of masses. (b) The $+$ polarization. (c) The $\times$ polarization.
${ }^{11}$ As we found in Example 46.4.
Fig. 46.2 Quadrupolar field lines representing the accelerations produced by a gravitational wave (a) with the $+$ polarization and (b) with the $\times$ polarization.

Example 46.8

  • (i) Electromagnetic waves: In the study of electromagnetic radiation, the multipole expansion of a source is a useful technique. Consider some object sitting in empty space, and assume it is made up of various charges, perhaps both positive and negative. We are interested in the electromagnetic field at some point distant from this object due to the effect of the charges within the object and their individual motions. The multipole expansion involves writing the charge distribution of this bounded object at a particular instant in time as a sum of terms of increasing complexity. We start off by adding up all the charge in the object and shrinking it to a point; at sufficient distance, the object will after all look like a point charge. The first term is then the monopolar contribution, the next term will be the dipole term, then we have a quadrupolar contribution. The charge in the object might be in some complicated motion, so we will also have some dynamic current-carrying contributions such as the magnetic dipole moment, magnetic quadrupole moment, etc. (no magnetic monopolar term though, since this turns out to be identically zero).
To get electromagnetic radiation out of this object, the charges within need to jiggle around. The acceleration of those charges could then produce electromagnetic waves. For example, if we could get the total charge in our object (the monopolar term) to oscillate up and down, we could generate spherically symmetric electromagnetic waves. However, this can't happen because we are not allowed to vary the total charge in our bounded object, charge being a conserved quantity. What we can do is vary the dipolar term in an oscillatory fashion, and this is how transmitters (and aerials) work. We make one end of an object positive and the other end negative by driving a current in one direction, and then by reversing the current we reverse the polarity of the dipole moment, making the formerly positive end negative and the formerly negative end positive. This oscillating dipole produces dipole radiation. Higher multipoles of electromagnetic radiation are possible, but charge conservation forbids the monopole term.
  • (ii) Gravitational waves: In the context of gravitation, we can consider an analogous multipole expansion of the mass distribution of an array of masses $m_{i}$ at positions $\vec{x}_{i}$. (For simplicity, we will put the origin of our coordinates at the centre of mass of this distribution.) We are looking for contributions that aren't conserved, since these can vary and therefore act as the source of gravitational waves.
The monopole contribution is the total mass $\sum_{i} m_{i}$ (also known as the zeroth mass moment $\mathcal{I}_{0}$), which is constant owing to mass conservation. The dipole moment of the distribution (also known as the first mass moment $\mathcal{I}_{1}$) is given by $\sum_{i} m_{i} \vec{x}_{i}$, but this is constant because of momentum conservation. The gravitational analogue of the magnetic moment is a first-order moment involving mass currents. It is given by the angular momentum $\vec{L}=\sum_{i} m_{i}\left(\vec{x}_{i} \times \dot{\vec{x}}_{i}\right)$. This first-order current moment is therefore conserved because of angular momentum conservation. This implies that the lowest order contribution to gravitational radiation can only be from the next term: the quadrupolar field.
In expanding the metric close to the source in the weak-field regime, the $00$ component can be expanded as a sum of the mass moments
\begin{equation*} g_{00}=-1+a_{0} \frac{\mathcal{I}_{0}}{r}+a_{1} \frac{\mathcal{I}_{1}}{r^{2}}+a_{2} \frac{\mathcal{I}_{2}}{r^{3}}+\ldots \tag{46.37} \end{equation*}
where $\mathcal{I}_{\ell}$ is the $\ell$th mass moment and $a_{i}$ are a set of constants. Similarly, the $0 j$ components of the metric can be expanded in terms of the current moments
\begin{equation*} g_{0 j}=b_{1} \frac{\mathcal{S}_{1}}{r^{2}}+b_{2} \frac{\mathcal{S}_{2}}{r^{3}}+\ldots \tag{46.38} \end{equation*}
where $\mathcal{S}_{\ell}$ are current moments and $b_{i}$ are constants. For a source of linear dimension $L$, we would expect on simple dimensional grounds that $\mathcal{I}_{\ell} \propto M L^{\ell}$ and $\mathcal{S}_{\ell} \propto M v L^{\ell}$, where $M$ is mass and $v$ is velocity. Since $\mathcal{I}_{0}$, $\mathcal{I}_{1}$ and $\mathcal{S}_{1}$ are all conserved, we infer that the leading-order time-varying contribution to $g_{00}$ is from the quadrupolar mass term $\mathcal{I}_{2}$, with the other terms contributing higher order corrections. This turns out to be the case.
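The conservation argument of this example can be illustrated with a small numerical sketch. Assuming (purely for illustration) two equal masses on a circular orbit about their centre of mass, in units where $M=a=\omega=1$, the Python fragment below evaluates the monopole, the dipole, the angular momentum and one component of the second mass moment at several times. All names and parameter values are our own assumptions.

```python
import math

M, A, OMEGA = 1.0, 1.0, 1.0  # hypothetical units: mass, orbital radius, angular frequency

def state(t):
    """Positions and velocities of two equal masses on a circular orbit about the origin."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    pos = [(A * c, A * s), (-A * c, -A * s)]
    vel = [(-A * OMEGA * s, A * OMEGA * c), (A * OMEGA * s, -A * OMEGA * c)]
    return pos, vel

def moments(t):
    """Return (monopole, dipole vector, L_z, I_xx) of the mass distribution at time t."""
    pos, vel = state(t)
    monopole = 2 * M
    dipole = (sum(M * x for x, _ in pos), sum(M * y for _, y in pos))
    l_z = sum(M * (x * vy - y * vx) for (x, y), (vx, vy) in zip(pos, vel))
    i_xx = sum(M * x * x for x, _ in pos)
    return monopole, dipole, l_z, i_xx

for t in (0.0, 0.5, 1.0, 1.5):
    mono, dip, l_z, i_xx = moments(t)
    print(f"t={t:.1f}  I0={mono:.1f}  I1=({dip[0]:+.1e},{dip[1]:+.1e})  "
          f"Lz={l_z:.3f}  Ixx={i_xx:.3f}")
```

Only the last column changes with time: the monopole, dipole and angular momentum stay fixed, while the second moment $I_{xx}=2Ma^{2}\cos^{2}\omega t$ oscillates and is therefore available as a source of radiation.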

46.3 Quadrupolar radiation

We will now derive a description of the quadrupolar radiation directly. We rewrite our wave equation with a source of radiation, so that
\begin{equation*} -\partial^{2} \bar{h}_{\mu \nu}=16 \pi G T_{\mu \nu} \tag{46.39} \end{equation*}
where the factor of $G$ has been restored. By analogy with the electromagnetic case,${ }^{12}$ we can write down the solution to this weak-field gravitation equation as
\begin{equation*} \bar{h}_{\mu \nu}(t, \vec{x})=4 G \int \mathrm{d}^{3} y \frac{T_{\mu \nu}(t-|\vec{x}-\vec{y}|, \vec{y})}{|\vec{x}-\vec{y}|}. \tag{46.41} \end{equation*}
Assuming that the source is compact, so that it is concentrated in a small region $\Sigma$ of linear size $r$ a distance $R \gg r$ away, the spatial components $\bar{h}_{i j}$ are given by
\begin{equation*} \bar{h}_{i j}(t) \approx \frac{4 G}{R} \int_{\Sigma} \mathrm{d}^{3} y\, T_{i j}(t-R, \vec{y}) \tag{46.42} \end{equation*}
The energy-momentum tensor is subject to a conservation law $\partial_{\mu} T^{\mu \nu}=0$ (or equivalently $T^{\mu \nu}{ }_{, \mu}=0$) and including that allows${ }^{13}$ us to rewrite eqn 46.42 as
\begin{equation*} \bar{h}_{i j}(t) \approx \frac{2 G}{R} \partial_{0}^{2} \int_{\Sigma} \mathrm{d}^{3} y\, y_{i} y_{j} T^{00}(t-R, \vec{y}) \tag{46.43} \end{equation*}
The integral is just the second moment of the mass distribution, which is (in energy units) the moment of inertia tensor $I_{i j}$, so we can write this equation in the simplified form
\begin{equation*} \bar{h}_{i j}(t) \approx \frac{2 G}{R} \ddot{I}_{i j}(t-R) . \tag{46.44} \end{equation*}
This is known as the Einstein quadrupole formula.${ }^{14}$ Note that the moment of inertia tensor $I_{i j}=\int_{\Sigma} \mathrm{d}^{3} y\, \rho\, y_{i} y_{j}$ differs from the quadrupole moment $Q_{i j}=\int_{\Sigma} \mathrm{d}^{3} y\, \rho\left(y_{i} y_{j}-\frac{1}{3} r^{2} \delta_{i j}\right)=I_{i j}-\frac{1}{3} \delta_{i j} \operatorname{Tr} I$ solely by its trace. Since we are working in the transverse-traceless gauge, we are insensitive to the trace and hence we can think of the moment of inertia as a quadrupole moment.
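As a worked illustration of eqn 46.44 (our own, under the assumption of a Newtonian circular orbit), take two equal point masses $M$ at positions $\pm a(\cos \omega t, \sin \omega t, 0)$. The non-zero second moments and their second time derivatives are then

```latex
\begin{align*}
I_{xx} &= 2Ma^{2}\cos^{2}\omega t = Ma^{2}(1+\cos 2\omega t), &
\ddot{I}_{xx} &= -4Ma^{2}\omega^{2}\cos 2\omega t, \\
I_{yy} &= 2Ma^{2}\sin^{2}\omega t = Ma^{2}(1-\cos 2\omega t), &
\ddot{I}_{yy} &= +4Ma^{2}\omega^{2}\cos 2\omega t, \\
I_{xy} &= 2Ma^{2}\sin\omega t\cos\omega t = Ma^{2}\sin 2\omega t, &
\ddot{I}_{xy} &= -4Ma^{2}\omega^{2}\sin 2\omega t.
\end{align*}
```

Each component of $\ddot{I}_{i j}$ therefore has amplitude $4 M a^{2} \omega^{2}$ and oscillates at twice the orbital frequency, while the trace $I_{x x}+I_{y y}=2 M a^{2}$ is constant and drops out on differentiation.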

Example 46.9

Let's put some numbers in to see how big an effect this could be. Consider a source at, say, $R=100~\mathrm{Mpc}$ away from us, consisting of a pair of black holes, each of mass $M=30 M_{\odot}$, orbiting each other at $f=\omega /(2 \pi)=10~\mathrm{Hz}$ and separated by $2a=2000~\mathrm{km}$. Using $\ddot{I} \approx 4 M a^{2} \omega^{2}$, we then find
\begin{equation*} |\bar{h}| \approx \frac{2 G}{c^{4} R} 4 M a^{2} \omega^{2}=\frac{32 \pi^{2} G M a^{2} f^{2}}{c^{4} R} \approx 5 \times 10^{-21} \tag{46.46} \end{equation*}
This will be a small effect!
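The arithmetic in eqn 46.46 is easy to reproduce. The sketch below uses rounded SI values for the constants and assumes that the quoted 2000 km is the full separation $2a$, so that $a=1000~\mathrm{km}$; this is the reading that reproduces the quoted strain. The function name is our own.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # speed of light, m s^-1 (rounded)
M_SUN = 1.989e30   # solar mass, kg (rounded)
MPC = 3.086e22     # one megaparsec, m (rounded)

def strain_amplitude(mass, a, freq, distance):
    """Order-of-magnitude strain |h| ~ 32 pi^2 G M a^2 f^2 / (c^4 R), as in eqn 46.46."""
    return 32 * math.pi**2 * G * mass * a**2 * freq**2 / (C**4 * distance)

# Assumed inputs: M = 30 solar masses, a = 1000 km, f = 10 Hz, R = 100 Mpc.
h = strain_amplitude(30 * M_SUN, 1.0e6, 10.0, 100 * MPC)
print(f"|h| ~ {h:.1e}")
```

Because the strain scales as $a^{2}$, taking $a=2000~\mathrm{km}$ instead would give a value four times larger.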
${ }^{15}$ After all, the energy-momentum of the gravitational field has no real meaning locally since you can always transform to a freely falling frame and gravity disappears.
The value of $\bar{h} \approx 10^{-21}$ is very small, and much tinier even than the Newtonian potential on the surface of the Earth, where $\left|h_{00}\right|=2 G M_{\oplus} /\left(R_{\oplus} c^{2}\right) \approx 10^{-9}$, and this highlights an issue with our approach so far. The gravitational waves that are detected in experiments to date have frequencies in the tens of Hz to a few kHz range and consequently have wavelengths $\lambda=c/f$ (hundreds to tens of thousands of km) which are short compared to astrophysical scales. These ripples in spacetime are superimposed on a larger, more slowly varying background due to astrophysical objects (including the astrophysical object on which a gravitational detector might be mounted, i.e. the Earth!). We have described the waves using a metric $g_{\mu \nu}=\eta_{\mu \nu}+h_{\mu \nu}$, expanding around a flat spacetime, whereas we probably really should write something like
\begin{equation*} g_{\mu \nu}=g_{\mu \nu}^{\mathrm{b}}+h_{\mu \nu} \tag{46.47} \end{equation*}
where $g_{\mu \nu}^{\mathrm{b}}$ describes some background curved metric. However, even this may not be enough as it is not obvious which contributions should be background and which should be due to the gravitational waves. Moreover, as we will see, gravitational waves carry energy and momentum and so this is also going to act as a source of curvature of spacetime. Our linearized theory has ignored this effect, and so one way of including this is to say that our linearized Einstein tensor $G_{\mu \nu}^{(1)}$ is modified to
\begin{equation*} G_{\mu \nu}^{(1)}=8 \pi\left(T_{\mu \nu}+t_{\mu \nu}\right) \tag{46.48} \end{equation*}
where $T_{\mu \nu}$ is due to matter and $t_{\mu \nu}$ is due to the effect of the gravitational field itself. Our exact Einstein equation is
\begin{equation*} G_{\mu \nu}=8 \pi T_{\mu \nu} \tag{46.49} \end{equation*}
where the exact Einstein tensor is written as
\begin{equation*} G_{\mu \nu} \equiv G_{\mu \nu}^{(1)}+G_{\mu \nu}^{(2)}+\cdots, \tag{46.50} \end{equation*}
where the sum is over a linear approximation $G_{\mu \nu}^{(1)}$ (linear in $h_{\mu \nu}$), a quadratic approximation $G_{\mu \nu}^{(2)}$ (quadratic in $h_{\mu \nu}$), etc. If we truncate the series at the quadratic term, then these equations suggest that
\begin{equation*} t_{\mu \nu}=-\frac{1}{8 \pi} G_{\mu \nu}^{(2)} \tag{46.51} \end{equation*}
It turns out that this quantity is not gauge invariant, and you probably wouldn't expect it to be.${ }^{15}$ The trick is to average this quantity over a region of spacetime that is spatially larger than the wavelength of a gravitational wave and temporally longer than the reciprocal of a gravitational-wave frequency, thereby capturing the curvature of the background spacetime. Thus, we arrive at
\begin{equation*} t_{\mu \nu}=-\frac{c^{4}}{8 \pi G}\left\langle G_{\mu \nu}^{(2)}\right\rangle, \tag{46.52} \end{equation*}
where the angle brackets denote this averaging process and we have restored the factors of $G$ and $c$.

46.4 Radiated energy and power

The evaluation of an expression for $t_{\mu \nu}$ from eqn 46.52 is rather tedious, but in the transverse-traceless gauge it produces${ }^{16}$ the rather pleasing result
\begin{equation*} t_{\mu \nu}=\frac{c^{4}}{32 \pi G}\left\langle\partial_{\mu} h_{\alpha \beta}^{\mathrm{TT}} \partial_{\nu} h_{\mathrm{TT}}^{\alpha \beta}\right\rangle \tag{46.53} \end{equation*}
In particular, the energy density is then given by${ }^{17}$
\begin{equation*} t_{00}=\frac{c^{4}}{32 \pi G}\left\langle\partial_{0} h_{\alpha \beta}^{\mathrm{TT}} \partial_{0} h_{\mathrm{TT}}^{\alpha \beta}\right\rangle=\frac{G}{8 \pi c^{6} R^{2}}\left\langle\dddot{I}_{i j}^{\mathrm{TT}} \dddot{I}_{i j}^{\mathrm{TT}}\right\rangle \tag{46.54} \end{equation*}
where we have used the Einstein quadrupole formula, eqn 46.44. We can work out the total flux of power from our source, namely the energy passing per second through a spherical surface of radius $R$, using the fact that the energy inside a volume $V$ is $\int \mathrm{d}^{3} x\, T^{00}$, so that the rate of change of energy is
\begin{equation*} \frac{\mathrm{d}}{\mathrm{d} t} \int \mathrm{d}^{3} x\, T^{00}=\int \mathrm{d}^{3} x\, \partial_{t} T^{00}=-c \int \mathrm{d}^{3} x\, \partial_{i} T^{0 i} \tag{46.55} \end{equation*}
where we have used $\partial_{\mu} T^{\mu \nu}=0$ and $\partial_{t}=c \partial_{0}$. The energy carried by the gravitational wave is the negative of this (because energy lost inside the volume $V$ is carried away by the gravitational waves), so together with Stokes' theorem we have the rate of change of energy emitted by quadrupolar waves as an integral over the surface $S$ and hence
\begin{equation*} \frac{\mathrm{d} E}{\mathrm{d} t}=c \int_{S} T^{0 i}\, \mathrm{d} \Sigma_{i}=c R^{2} \int T^{0 r}\, \mathrm{d} \Omega \tag{46.56} \end{equation*}
where the final result is an integral over solid angle (over a sphere of radius $R$). In the present case, we are working in terms of $t_{\mu \nu}$ due to the effect of the gravitational field, and therefore we need $t^{0 r}$ in eqn 46.56. We can get this from eqn 46.53, but we can also use the fact that $\bar{h}_{i j}$ is a function of $t-R / c$ and hence
\begin{equation*} \partial_{r} \bar{h}_{i j}=-\partial_{0} \bar{h}_{i j}=+\partial^{0} \bar{h}_{i j} \tag{46.57} \end{equation*}
which means we can write $t^{0 r}=t^{00}$. Thus, the rate of change of energy carried away by gravitational waves from their source is
\begin{equation*} \frac{\mathrm{d} E}{\mathrm{d} t}=c R^{2} \int_{\Omega} t^{00}\, \mathrm{d} \Omega=\frac{G}{8 \pi c^{5}} \int_{\Omega}\left\langle\dddot{I}_{i j}^{\mathrm{TT}} \dddot{I}_{i j}^{\mathrm{TT}}\right\rangle \mathrm{d} \Omega . \tag{46.58} \end{equation*}
To evaluate this, we can use the identity${ }^{18}$
\begin{equation*} M_{i j}^{\mathrm{TT}} M_{i j}^{\mathrm{TT}}=M_{i j} M_{i j}-2 n_{a} n_{b} M_{i a} M_{i b}+\frac{1}{2} n_{a} n_{b} n_{c} n_{d} M_{a b} M_{c d} \tag{46.59} \end{equation*}
The integral evaluates${ }^{19}$ to
\begin{equation*} \frac{\mathrm{d} E}{\mathrm{d} t}=\frac{G}{5 c^{5}}\left\langle\dddot{I}_{i j} \dddot{I}_{i j}\right\rangle \tag{46.60} \end{equation*}
The prefactor $G /\left(5 c^{5}\right)$ is extremely small and hence most potential sources of gravitational radiation produce only a very weak emitted power.${ }^{20}$
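The step from eqn 46.58 to eqn 46.60 amounts to the angular integral $\int M_{i j}^{\mathrm{TT}} M_{i j}^{\mathrm{TT}}\, \mathrm{d} \Omega=\frac{8 \pi}{5} M_{i j} M_{i j}$, which holds for a symmetric, traceless $M_{i j}$. The Python sketch below checks this numerically with a simple midpoint rule on the sphere. The test matrix, the grid sizes and the function names are arbitrary choices of ours.

```python
import math

def tt_contraction(m, n):
    """M^TT_ij M^TT_ij via eqn 46.59, for a symmetric traceless 3x3 matrix m and unit vector n."""
    mm = sum(m[i][j] * m[i][j] for i in range(3) for j in range(3))
    mn = [sum(m[i][a] * n[a] for a in range(3)) for i in range(3)]  # M_ia n_a
    n_mm_n = sum(v * v for v in mn)                                  # n_a n_b M_ia M_ib
    n_m_n = sum(n[i] * mn[i] for i in range(3))                      # n_a n_b M_ab
    return mm - 2.0 * n_mm_n + 0.5 * n_m_n ** 2

def sphere_integral(f, n_theta=100, n_phi=200):
    """Midpoint-rule integral of f(n) over the unit sphere."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        st, ct = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            n = (st * math.cos(phi), st * math.sin(phi), ct)
            total += f(n) * st * d_theta * d_phi
    return total

# Hypothetical symmetric, traceless matrix standing in for the third derivative of I_ij.
M_TEST = [[1.0, 0.5, 0.0],
          [0.5, -0.3, 0.2],
          [0.0, 0.2, -0.7]]

integral = sphere_integral(lambda n: tt_contraction(M_TEST, n))
expected = (8.0 * math.pi / 5.0) * sum(M_TEST[i][j] ** 2 for i in range(3) for j in range(3))
print(f"numerical: {integral:.4f}   (8 pi / 5) M_ij M_ij: {expected:.4f}")
```

Multiplying $\frac{8 \pi}{5}$ by the prefactor $G /\left(8 \pi c^{5}\right)$ of eqn 46.58 gives the $G /\left(5 c^{5}\right)$ of eqn 46.60.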
${ }^{16}$ The calculation is rather tedious, but it is laid out in detail in Exercise 46.2 for any reader who wishes to follow it through.
${ }^{17}$ When we are just focussing on spatial vectors, the distinction between upstairs and downstairs indices becomes unnecessary, and following the lead of most authors in this field, we will write $\left\langle\dddot{I}_{i j}^{\mathrm{TT}} \dddot{I}_{i j}^{\mathrm{TT}}\right\rangle$ rather than $\left\langle\dddot{I}_{i j}^{\mathrm{TT}} \dddot{I}_{\mathrm{TT}}^{i j}\right\rangle$, which is less fussy. The Einstein summation convention still holds, so $i$ and $j$ are summed over.
${ }^{18}$ See Exercise 46.5(d).
${ }^{19}$ See Exercise 46.5(e). Note that this is the energy flux associated with the gravitational waves. The rate of change of energy inside the volume is minus this.
${ }^{20}$ Moreover, note that a spherically symmetric source (such as a rotating star) has zero $\left\langle\dddot{I}_{i j} \dddot{I}_{i j}\right\rangle$ (its moment of inertia does not change with time) and will not emit gravitational waves.
${ }^{21}$ As shown in Exercise 46.4, the emitted power for most orbiting systems is tiny. To get some sizeable, and hence potentially detectable, emitted power, we need some pretty dramatic situations, such as the two closely spaced black holes whirling around each other at ferocious speed considered in this example. We could have chosen two neutron stars, but we wanted to be even more dramatic! Note that $L \propto \omega^{6}$, so high-frequency signals contain much more power.
${ }^{22}$ The derivation is in Maggiore, Section 3.3.3. Note that this expression for the angular momentum implies that angular momentum will not be emitted if the source is axisymmetric.

Example 46.10

Returning to the numbers${ }^{21}$ in Example 46.9, we would estimate (using $\left\langle\dddot{I}_{i j} \dddot{I}_{i j}\right\rangle=128 M^{2} a^{4} \omega^{6}$, as shown in eqn 46.106 from Exercise 46.6) that the gravitational luminosity $L$ of our binary black-hole system would be
\begin{equation*} L=\frac{128 G}{5 c^{5}} M^{2} a^{4} \omega^{6} \approx 10^{47}~\mathrm{W} \tag{46.61} \end{equation*}
although the power flux on Earth, $L /\left(4 \pi R^{2}\right)$, works out to be only about $1~\mathrm{mW\,m^{-2}}$.
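These estimates can be checked with a few lines of Python. The sketch assumes rounded SI constants, the reading $a=1000~\mathrm{km}$ (separation $2a=2000~\mathrm{km}$), and that the power spreads uniformly over a sphere of radius $R=100~\mathrm{Mpc}$. The function name is hypothetical.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # speed of light, m s^-1 (rounded)
M_SUN = 1.989e30   # solar mass, kg (rounded)
MPC = 3.086e22     # one megaparsec, m (rounded)

def gw_luminosity(mass, a, omega):
    """Quadrupole luminosity L = 128 G M^2 a^4 omega^6 / (5 c^5), as in eqn 46.61."""
    return 128.0 * G * mass**2 * a**4 * omega**6 / (5.0 * C**5)

# Assumed inputs: M = 30 solar masses, a = 1000 km, f = 10 Hz, R = 100 Mpc.
omega = 2.0 * math.pi * 10.0
luminosity = gw_luminosity(30 * M_SUN, 1.0e6, omega)
flux = luminosity / (4.0 * math.pi * (100 * MPC) ** 2)

print(f"L    ~ {luminosity:.1e} W")
print(f"flux ~ {flux:.1e} W m^-2")
```

The strong $\omega^{6}$ dependence means that doubling the orbital frequency at fixed $a$ raises the luminosity by a factor of 64.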
One can also show that not only is energy carried away by gravitational waves but also angular momentum. By analogy with eqn 46.56, the rate of change of angular momentum is given by
\begin{equation*} \frac{\mathrm{d} J_{i}}{\mathrm{d} t}=-\int_{S} \epsilon_{i j k} x^{j} T^{k m}\, \mathrm{d} \Sigma_{m} \tag{46.62} \end{equation*}
and this can be used to show that the rate of change of angular momentum carried by the waves from the source is${ }^{22}$
\begin{equation*} \frac{\mathrm{d} J^{i}}{\mathrm{d} t}=\frac{2 G}{5 c^{5}} \epsilon_{i j k}\left\langle\ddot{I}_{j \ell} \dddot{I}_{k \ell}\right\rangle \tag{46.63} \end{equation*}
For both eqn 46.60 and eqn 46.63 the quantities $\dddot{I}_{i j}$ are evaluated at the retarded time $t_{\mathrm{r}}=t-R / c$.

Example 46.11

An orbiting pair of black holes has a gravitational-wave luminosity $L$ given by eqn 46.61 so that $L=\frac{128 G}{5 c^{5}} M^{2} a^{4} \omega^{6}$, but the two objects (assuming Newtonian mechanics holds) have a gravitational potential energy equal to $-G M^{2} /(2 a)$ and a kinetic energy of $2 \times \frac{1}{2} M v^{2}=G M^{2} /(4 a)$ so that the total energy is $E=-G M^{2} /(4 a)=-M a^{2} \omega^{2}$ and
\begin{equation*} \omega=\sqrt{\frac{G M}{4 a^{3}}} \tag{46.64} \end{equation*}
This expression for $\omega$ allows us to rearrange eqn 46.61 to give
\begin{equation*} L=\frac{\mathrm{d} E}{\mathrm{d} t}=\frac{128 G}{5 c^{5}} M^{2} a^{4} \omega^{6}=\frac{2}{5} \frac{G^{4} M^{5}}{a^{5} c^{5}} . \tag{46.65} \end{equation*}
It is useful to express $E$ in terms of the orbital period $P=2 \pi / \omega$, and some rearrangement gives
\begin{equation*} E=-\left(\frac{G^{2} M^{5} \pi^{2}}{4}\right)^{\frac{1}{3}} P^{-\frac{2}{3}} \tag{46.66} \end{equation*}
and hence the rate of change of orbital period is
\begin{equation*} \frac{\mathrm{d} P}{\mathrm{d} t}=-\frac{3}{2} \cdot \frac{P}{E} \frac{\mathrm{d} E}{\mathrm{d} t} \tag{46.67} \end{equation*}
so that using eqn 46.65 (with $\mathrm{d} E / \mathrm{d} t=-L$ for the energy of the orbiting pair) we have
\begin{equation*} \frac{\mathrm{d} P}{\mathrm{d} t}=-\frac{3}{5} 4^{\frac{11}{3}} \pi^{\frac{8}{3}}\left(\frac{G M}{c^{3} P}\right)^{\frac{5}{3}} \tag{46.68} \end{equation*}
Note that the two orbiting black holes are in a bound state so that the total energy $E$ is negative. Therefore, when energy is lost via emission of gravitational waves the energy becomes more negative and hence $|E|$ is larger. This causes the orbital period $P$ to decrease (hence the minus sign in eqn 46.68) and the orbiting becomes faster. This speed-up of the orbital motion (called spin-up) was observed in a binary pulsar system in 1974 by Hulse and Taylor and was the first (indirect) discovery of gravitational waves (see Fig. 46.3). We receive radio emission from only one of the pair; the Doppler shift of the radio pulses allows us to estimate the orbital period (7.75 hours) and to measure that the orbit is very slowly speeding up, with a decrease of orbital period of 76.5 microseconds per year. As $|E|$ increases the two black holes get closer together.
Angular momentum is also carried away by the gravitational waves. The only non-zero component is perpendicular to the orbital plane and yields${ }^{23}$
\begin{equation*} \frac{\mathrm{d} J^{z}}{\mathrm{d} t}=\frac{128 G M^{2} a^{4} \omega^{5}}{5 c^{5}}=\frac{4 G^{\frac{7}{2}} M^{\frac{9}{2}}}{5 a^{\frac{7}{2}} c^{5}} \tag{46.69} \end{equation*}
The angular momentum of our orbiting pair of black holes is J = 2 M a 2 ω = J = 2 M a 2 ω = J=2Ma^(2)omega=J=2 M a^{2} \omega=J=2Ma2ω= G M a 3 = E ω / 2 G M a 3 = E ω / 2 sqrt(GMa^(3))=-E omega//2\sqrt{G M a^{3}}=-E \omega / 2GMa3=Eω/2. Also, eliminating a a aaa and ω ω omega\omegaω we have E = G 2 M 5 / ( 4 J 2 ) E = G 2 M 5 / 4 J 2 E=-G^(2)M^(5)//(4J^(2))E=-G^{2} M^{5} /\left(4 J^{2}\right)E=G2M5/(4J2) and hence d J / d t = ( J / 2 E ) d E / d t d J / d t = ( J / 2 E ) d E / d t dJ//dt=-(J//2E)dE//dt\mathrm{d} J / \mathrm{d} t=-(J / 2 E) \mathrm{d} E / \mathrm{d} tdJ/dt=(J/2E)dE/dt and therefore substituting in our expression for d E / d t d E / d t dE//dt\mathrm{d} E / \mathrm{d} tdE/dt gives
(46.70) d J d t = 4 G 7 2 M 9 2 5 a 7 2 c 5 (46.70) d J d t = 4 G 7 2 M 9 2 5 a 7 2 c 5 {:(46.70)(dJ)/((d)t)=-(4G^((7)/(2))M^((9)/(2)))/(5a^((7)/(2))c^(5)):}\begin{equation*} \frac{\mathrm{d} J}{\mathrm{~d} t}=-\frac{4 G^{\frac{7}{2}} M^{\frac{9}{2}}}{5 a^{\frac{7}{2}} c^{5}} \tag{46.70} \end{equation*}(46.70)dJ dt=4G72M925a72c5
in agreement with eqn 46.69 (the sign change being that the angular momentum lost by the orbiting pair is carried off by the gravitational waves).
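As a quick numerical sanity check of eqns 46.69 and 46.70, the sketch below evaluates both forms of the angular-momentum loss rate for a hypothetical pair of 30-solar-mass black holes. It assumes the Newtonian relation $\omega^{2}=G M /(4 a^{3})$ for two equal masses $M$, each at radius $a$ from the centre of mass, which is what makes the two forms of eqn 46.69 equal. The masses, radius, constants and function names are illustrative choices, not values from the text.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m s^-1
M_SUN = 1.989e30   # solar mass, kg


def orbital_omega(M, a):
    """Angular frequency of two equal masses M, each at radius a (assumed Newtonian)."""
    return math.sqrt(G * M / (4.0 * a**3))


def djdt_omega_form(M, a):
    """First form in eqn 46.69: 128 G M^2 a^4 w^5 / (5 c^5)."""
    w = orbital_omega(M, a)
    return 128.0 * G * M**2 * a**4 * w**5 / (5.0 * C**5)


def djdt_closed_form(M, a):
    """Second form in eqn 46.69: 4 G^(7/2) M^(9/2) / (5 a^(7/2) c^5)."""
    return 4.0 * G**3.5 * M**4.5 / (5.0 * a**3.5 * C**5)


def angular_momentum(M, a):
    """J = 2 M a^2 w for the orbiting pair."""
    return 2.0 * M * a**2 * orbital_omega(M, a)


def energy_from_j(M, J):
    """E = -G^2 M^5 / (4 J^2), as quoted in the text."""
    return -G**2 * M**5 / (4.0 * J**2)


if __name__ == "__main__":
    M, a = 30.0 * M_SUN, 5.0e5   # hypothetical mass (kg) and orbital radius (m)
    print("dJ/dt, omega form :", djdt_omega_form(M, a))
    print("dJ/dt, closed form:", djdt_closed_form(M, a))
    J = angular_momentum(M, a)
    # Eliminating J again should recover the Newtonian energy -G M^2 / (4 a).
    print("E from J:", energy_from_j(M, J), " E direct:", -G * M**2 / (4.0 * a))
```

The two printed rates should agree to rounding error, and the energy recovered from $J$ should match $-G M^{2} /(4 a)$.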
Our treatment of these astrophysical cases has assumed Newtonian dynamics, so we would not expect it to apply in the final stages of the inspiral of two compact objects, as their orbital periods drop rapidly and their orbital velocities become relativistic. Our approach has also assumed a linearized approximation to general relativity, and here we encounter a big difference with electromagnetism. In the electromagnetic case, the force-mediating particle (the photon) and its associated wave (the electromagnetic wave) have no charge and so there is no nonlinear effect in free space. For gravity, gravitational waves do transmit energy (i.e. mass), which is itself a source of gravity; thus, the theory is inherently nonlinear and we take a brief moment in the next section to examine the consequences of this.
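To get a feel for the Hulse-Taylor numbers quoted earlier in this section, the sketch below converts a decrease of 76.5 microseconds per year in a 7.75 hour orbit into a dimensionless rate $\dot{P}$ and into the cumulative shift in periastron time of the kind plotted in Fig. 46.3. It assumes a constant $\dot{P}$, so that the accumulated shift after a time $t$ is approximately $\frac{1}{2}(\dot{P} / P) t^{2}$. The function names and the 30-year baseline are illustrative assumptions.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600.0


def period_derivative(decrease_us_per_year):
    """Dimensionless dP/dt from a period decrease given in microseconds per year."""
    return -decrease_us_per_year * 1e-6 / SECONDS_PER_YEAR


def cumulative_shift(p_dot, period_s, elapsed_years):
    """Approximate shift (s) in periastron time after elapsed_years, for constant dP/dt."""
    t = elapsed_years * SECONDS_PER_YEAR
    return 0.5 * (p_dot / period_s) * t**2


if __name__ == "__main__":
    period_s = 7.75 * 3600.0              # orbital period in seconds
    p_dot = period_derivative(76.5)       # negative: the period is shrinking
    print("dP/dt =", p_dot)
    print("shift after 30 yr =", cumulative_shift(p_dot, period_s, 30.0), "s")
```

The shift grows quadratically with time, which is why the curve in a cumulative-shift plot is a parabola rather than a straight line.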

46.5 An exact solution

So far we have only examined gravitational waves within the linear approximation. Will they survive in the exact, nonlinear theory of gravitation? We can show that they shall.
We argued in Section 46.2 for solutions as a function of $\phi=(-\omega t+k z)$ with $\omega=k^{z}=|k|$, since gravitational waves are described by null velocity vectors. Since we expect gravitational waves to be null, it is useful to work in light-cone coordinates $u=t-z$ and $v=t+z$ in which the Minkowski metric is written in terms of the line element as
\begin{equation*}
\mathrm{d} s^{2}=\eta_{\mu \nu} \mathrm{d} x^{\mu} \mathrm{d} x^{\nu}=-\mathrm{d} u\, \mathrm{d} v+\mathrm{d} x^{2}+\mathrm{d} y^{2} . \tag{46.71}
\end{equation*}
Fig. 46.3 The orbital decay of the Hulse-Taylor binary pulsar system PSR B1913+16, together with the prediction from general relativity. Our treatment has assumed circular orbits of two equal masses, but the curve here assumes the correct elliptical orbital parameters. This plot shows cumulative shifts in the time of periastron. [Figure reproduced from J. M. Weisberg and Y. Huang, Astrophysical Journal 829, 55 (2016), doi:10.3847/0004-637X/829/1/55 (C) American Astronomical Society.]
${ }^{23}$ See Exercise 46.7.
We can therefore look for an exact, wavelike solution to the Einstein field equation with metric
\begin{equation*}
\mathrm{d} s^{2}=\left(\eta_{\mu \nu}+h_{\mu \nu}\right) \mathrm{d} x^{\mu} \mathrm{d} x^{\nu}=-\mathrm{d} u\, \mathrm{d} v+F(u)^{2}\, \mathrm{d} x^{2}+G(u)^{2}\, \mathrm{d} y^{2}, \tag{46.72}
\end{equation*}
where $F$ and $G$ are functions to be determined by the Einstein equation.
Example 46.12
Using the metric from this line element, we can generate the useful connections and Riemann components. These are
\begin{equation*}
\Gamma_{x u}^{x}=\frac{1}{F} \frac{\partial F}{\partial u}, \quad \Gamma_{y u}^{y}=\frac{1}{G} \frac{\partial G}{\partial u}, \quad \Gamma_{x x}^{v}=2 F \frac{\partial F}{\partial u}, \quad \Gamma_{y y}^{v}=2 G \frac{\partial G}{\partial u} \tag{46.73}
\end{equation*}
and
\begin{equation*}
R_{u x u}^{x}=-\frac{1}{F} \frac{\partial^{2} F}{\partial u^{2}}, \quad R_{u y u}^{y}=-\frac{1}{G} \frac{\partial^{2} G}{\partial u^{2}} . \tag{46.74}
\end{equation*}
We therefore have the Einstein equation (for a vacuum) as
\begin{equation*}
\frac{1}{F} \frac{\partial^{2} F}{\partial u^{2}}+\frac{1}{G} \frac{\partial^{2} G}{\partial u^{2}}=0 . \tag{46.75}
\end{equation*}
If $F(u)=1+\varepsilon(u)$, where the function $\varepsilon(u)$ is assumed small, then eqn 46.75 can be solved by $G(u)=1-\varepsilon(u)$, since we obtain
\begin{equation*}
\frac{\varepsilon^{\prime \prime}(u)}{1+\varepsilon(u)}-\frac{\varepsilon^{\prime \prime}(u)}{1-\varepsilon(u)} \approx \varepsilon^{\prime \prime}(u)[1-\varepsilon(u)+\ldots]-\varepsilon^{\prime \prime}(u)[1+\varepsilon(u)+\ldots] \approx 0 .
\end{equation*}
Since, in this case, we have $h_{x x} \approx 2 \varepsilon(u)$ and $h_{y y} \approx-2 \varepsilon(u)$ to first order in $\varepsilon$, this is just the case shown in Fig. 46.1(b) of a + polarized wave.
We conclude that the exact, nonlinear case therefore supports solutions similar to the ones found for the linear, weak-field theory.
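The vacuum condition in eqn 46.75 can also be explored numerically. The sketch below evaluates the residual $F^{\prime \prime} / F+G^{\prime \prime} / G$ by central finite differences for two hypothetical profiles: the perturbative pair $F=1+\varepsilon$, $G=1-\varepsilon$ with $\varepsilon(u)=A \sin u$, for which the residual is of order $A^{2}$, and the pair $F=\cos u$, $G=\cosh u$, which satisfies eqn 46.75 exactly because $F^{\prime \prime} / F=-1$ and $G^{\prime \prime} / G=+1$. The profiles, step size and function names are assumptions chosen for illustration and are not taken from the text.

```python
import math


def second_derivative(f, u, step=1e-3):
    """Central finite-difference estimate of f''(u)."""
    return (f(u + step) - 2.0 * f(u) + f(u - step)) / step**2


def vacuum_residual(F, G, u):
    """Left-hand side of eqn 46.75, F''/F + G''/G, evaluated at u."""
    return second_derivative(F, u) / F(u) + second_derivative(G, u) / G(u)


def perturbative_pair(amplitude):
    """F = 1 + eps, G = 1 - eps with the hypothetical profile eps(u) = A sin(u)."""
    def F(u):
        return 1.0 + amplitude * math.sin(u)

    def G(u):
        return 1.0 - amplitude * math.sin(u)

    return F, G


if __name__ == "__main__":
    u = 1.0
    for amplitude in (1e-2, 1e-3):
        F, G = perturbative_pair(amplitude)
        # The residual should shrink roughly as amplitude**2.
        print("A =", amplitude, "residual =", vacuum_residual(F, G, u))
    # An exact pair: the residual is zero up to finite-difference error.
    print("exact pair residual =", vacuum_residual(math.cos, math.cosh, 0.3))
```

Reducing the amplitude by a factor of ten reduces the residual of the perturbative pair by roughly a factor of a hundred, which is the numerical signature of a solution that is valid only to first order in $\varepsilon$.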

46.6 The discovery of gravitational waves

For a hundred years gravitational waves were, apart from the impressive but indirect evidence from the spin-up of the Hulse-Taylor binary pulsar, a theoretical construct. It was believed that they were there, but they had not been directly detected. That has all changed due to the extraordinary results that have been obtained from ground-based gravitational-wave observatories. These are designed to probe the relatively high-frequency portion of the gravitational-wave spectrum, from about 10 Hz to about 10 kHz. This part of the spectrum is dominated by signals originating from stellar-mass compact sources, principally coalescing binary black hole and neutron star systems.
In ground-based observatories, the idea is to use the alternating motion of masses produced by a passing gravitational wave that we can detect using laser interferometry. The key here is to use the sensitivity of interference to optical path length to detect the oscillatory motion of the masses. We've seen how gravitational waves lead to quadrupolar oscillations of masses. To see the + polarization for a wave propagating in the $z$-direction, for example, we would need to access the relative motion of at least two masses, such as one separated along $x$ and one separated along $y$. In order to do this, we use the masses to form a Michelson${ }^{24}$ interferometer in the $x$-$y$ plane, as shown in Fig. 46.4. Laser light is split by a beam-splitting mirror and travels along the two arms of the interferometer. The lengths of the arms are defined by mirrored masses, which reflect the light back to where the beams are recombined and detected via a photodetector as shown. The bad news is that the effect is small. Even for merging neutron stars or collapsing supernovae, we only expect fractional changes in the displacement of the mirrors in the experiment of $10^{-21}$. In order to see an oscillation from a gravitational wave, a typical photon must remain in the system for at least half the period of the gravitational wave, which turns out to be of order milliseconds. As a result, we require interferometers with very long arms and exceedingly sensitive detection technologies. This is far from trivial. It took several decades to develop the technology to achieve this extraordinary level of sensitivity.
Example 46.13
The intensity at the photodetector is determined by the phase shift between the light combined from each arm of the interferometer. Take the arms to have lengths $L_{x}$ and $L_{y}$. If the arm lengths vary by respective amounts $\delta x$ and $\delta y$ then the phase shift is $\Delta \phi=\omega_{0}(2 \delta y-2 \delta x)$, where $\omega_{0}$ is the laser frequency. Using eqn 46.33 we can rewrite this as
\begin{equation*}
\Delta \phi(t)=\left(L_{x}+L_{y}\right) h_{+}(t), \tag{46.77}
\end{equation*}
allowing us to see the relationship between the $\boldsymbol{h}$ field and the phase shift. Assuming the arms are roughly the same length $L$ we have that the intensity measured is
\begin{equation*}
I \propto \Delta \phi(t) \approx 2 L h_{+}(t) . \tag{46.78}
\end{equation*}
We conclude that the length of the arms must be maximized to have a chance of seeing the effect.
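To put numbers to this conclusion, the sketch below estimates the change in arm length produced by a strain of $10^{-21}$ for a few arm lengths. It assumes the standard linearized result that a + polarized wave of amplitude $h$ changes a proper length $L$ along one of its axes by about $h L / 2$. That relation, the function name and the sample lengths are assumptions made for illustration rather than results quoted from this example.

```python
def arm_length_change(strain, arm_length_m):
    """Approximate change in arm length, delta L = h L / 2 (assumed linearized relation)."""
    return 0.5 * strain * arm_length_m


if __name__ == "__main__":
    strain = 1e-21
    # Hypothetical arm lengths in metres: bench-top, kilometre scale, 1000 km optical path.
    for length in (1.0, 1.0e3, 1.0e6):
        print("L =", length, "m  ->  delta L =", arm_length_change(strain, length), "m")
```

Even for a 1000 km path the change is far smaller than an atomic nucleus, which is why both long arms and very sensitive phase detection are needed.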
LIGO stands for the Laser Interferometer Gravitational-Wave Observatory. It comprises two identical detectors: one interferometer in Livingston, Louisiana and one in Hanford, Washington. The arms of the interferometers are 4 km in length (for comparison, the Michelson-Morley experiment involved arms of length 1.3 m). However, these still cannot introduce a long enough optical path difference to detect the oscillations from the tiny displacements caused by the metric perturbations of a gravitational wave. As a result, Fabry-Perot cavities are also mounted along the arms, increasing the effective optical length of the arms up to $\approx 1200$ km. The merger of two black holes provided the source of gravitational waves that were detected on 14 September, 2015 by both of the twin LIGO interferometers (Fig. 46.5). This particular gravitational-wave signal is thought to have been due to the inward spiral and merger of a pair of black holes, estimated to be around $36 M_{\odot}$ and $29 M_{\odot}$, and the subsequent 'ringdown' of the single resulting black hole of mass $62 M_{\odot}$, with the remaining $3 M_{\odot} c^{2}$ energy radiated as gravitational waves. This measurement and subsequent ones have provided some of the most stringent experimental verifications of general relativity.
${ }^{24}$ Albert A. Michelson (1852-1931). The Michelson-Morley experiment was, of course, very important in the development of special relativity, although it didn't seem to be one of Einstein's main motivations. It did, however, represent the key test of a $v^{2} / c^{2}$ correction to the pre-relativistic theory, predicted by Lorentz on the strength of the ether theory using a locally defined time. The null result of the experiment led Lorentz to propose length contraction. See Cheng for an accessible account of the history.
Fig. 46.4 (a) A schematic of a gravitational-wave detector. (b,c) The gravitational wave's effect on a test mass system and its corresponding effect on the arms of the interferometer.
Fig. 46.5 The gravitational-wave event GW150914 observed by the LIGO Hanford (H1, left column panels) and Livingston (L1, right column panels) detectors. Times are shown relative to 14 September, 2015 at 09:50:45 UTC. [From B. P. Abbott et al. Phys. Rev. Lett. 116, 061102 (2016), DOI: 10.1103/PhysRevLett.116.061102, published by the American Physical Society under the terms of the Creative Commons Attribution 3.0 License.]
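The energy budget quoted for GW150914 can be checked with a few lines of arithmetic. The sketch below takes the rounded masses from the text (36 and 29 solar masses merging to 62) and converts the mass deficit to joules. The values used for the solar mass and the speed of light, and the function name, are supplied here as assumptions.

```python
M_SUN = 1.989e30   # solar mass, kg
C = 2.998e8        # speed of light, m s^-1


def radiated_energy(m1_solar, m2_solar, m_final_solar):
    """Return (mass deficit in solar masses, radiated energy in joules)."""
    deficit = m1_solar + m2_solar - m_final_solar
    return deficit, deficit * M_SUN * C**2


if __name__ == "__main__":
    deficit, energy = radiated_energy(36.0, 29.0, 62.0)
    print("mass deficit =", deficit, "solar masses")
    print("radiated energy =", energy, "J")
```

Because the input masses are rounded estimates, the result should be read as an order-of-magnitude figure rather than a precise measurement.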
These results have stimulated the building of further gravitational wave observatories, including the Einstein Telescope and Cosmic Explorer which are planned to achieve an order of magnitude increase in sensitivity and would therefore be able to study the evolution of compact objects in the early Universe. However, other projects are aimed at building an observatory a long way from the ground. The proposed space-based Laser Interferometer Space Antenna (LISA) will be able to explore much lower frequency gravitational waves (from around $100\ \mu\mathrm{Hz}$ to 100 mHz). It is expected to be capable of detecting at very high redshift the first seed black holes formed, as well as intermediate-mass and 'light' super-massive coalescing black hole systems in the $10^{2}-10^{7} M_{\odot}$ range. It should therefore be able to follow the evolution of black holes right from the early Universe.
To get to really low frequencies, the best technology for detecting gravitational waves is the use of pulsar timing arrays. These work in the nanohertz to microhertz frequency band and can be used to detect gravitational-wave remnants from the past mergers of supermassive black holes. The basic idea is that rather than using laser interferometers (as used in both LIGO and LISA), one measures the pulse arrival time at Earth from an array of millisecond pulsars (i.e. rapidly rotating neutron stars). These pulsars have extremely regular and stable periods and act as ideal timing sources. A gravitational wave emitted from some astrophysical source will pass the pulsar and the Earth and so this will produce two perturbations on the signal received from the pulsar: one from spacetime variations at the pulsar and the other from spacetime variations on Earth. Data from the pulsar arrays have to be accumulated over many years, so these experiments are not quick.
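The link between observing time and frequency band can be made concrete. The sketch below uses the rule of thumb that a data set spanning a time $T$ cannot resolve frequencies much below $1 / T$, which is why nanohertz signals need years of pulsar timing. The rule is a general Fourier-analysis estimate rather than a statement from the text, and the function name and sample spans are illustrative.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600.0


def lowest_frequency_hz(span_years):
    """Rough lowest resolvable frequency, 1/T, for a data span of span_years."""
    return 1.0 / (span_years * SECONDS_PER_YEAR)


if __name__ == "__main__":
    for span in (1.0, 10.0, 30.0):
        print(span, "yr of data  ->  about", lowest_frequency_hz(span), "Hz")
```

A decade of timing data therefore reaches down to a few nanohertz, consistent with the frequency band quoted above.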
The stochastic background of remnant primordial gravitational waves produced during the Big Bang will dominate the gravitational-wave spectrum down to approximately $10^{-18}$ Hz and, depending on the cosmological model, some of this background may lie in a region that could be probed by the technologies discussed above. This is an open question, but the modern era of gravitational-wave astronomy that has just begun looks set to have an exciting and productive future in the coming decades.
We have seen in this chapter how general relativity predicts the existence of wave-like excitations in the gravitational field. In the next chapter, we examine gravitational waves from another point of view: that of quantum fields, where these waves are quantized into force-carrying particles known as gravitons.

Chapter summary

  • A gravitational wave solution to the linearized Einstein equation has the form $\bar{h}_{\mu \nu}=\operatorname{Re}\left[A_{\mu \nu} \mathrm{e}^{\mathrm{i} \boldsymbol{k} \cdot \boldsymbol{x}}\right]$. The waves have null $\boldsymbol{k}$ and amplitude orthogonal to the direction of propagation.
  • Transverse-traceless gauge can be used to simplify the solutions, leading to + and $\times$ polarizations, which are rotated by $45^{\circ}$ relative to each other.
  • There are wave solutions containing nonlinear terms that solve the Einstein equations.
  • The gravitational wave luminosity of a source is given by
\begin{equation*}
L=\frac{\mathrm{d} E}{\mathrm{d} t}=\frac{G}{5 c^{5}}\left\langle\dddot{I}_{i j} \dddot{I}_{i j}\right\rangle,
\end{equation*}
and the radiation is quadrupolar.
  • Gravitational wave astronomy is a rapidly growing area in astrophysics. The laser interferometers that make up the ground-based LIGO or the space-based LISA have, or will have, extraordinary sensitivity. Lower frequency gravitational waves are expected to be detected by pulsar timing arrays.
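As an illustration of the luminosity formula in this summary, the sketch below applies it to two equal masses $M$ on a circular orbit of radius $a$ about their centre of mass. It uses $I_{i j}=\sum m\, x_{i} x_{j}$, whose time-dependent parts are $\pm M a^{2} \cos 2 \omega t$ and $M a^{2} \sin 2 \omega t$, averages $\dddot{I}_{i j} \dddot{I}_{i j}$ over one orbit, and compares the result with the closed form $128 G M^{2} a^{4} \omega^{6} /(5 c^{5})$ obtained by doing the average by hand. The sample values and function names are hypothetical, and the trace of $I_{i j}$ is constant for this source, so no trace subtraction is needed.

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m s^-1


def third_derivatives(M, a, omega, t):
    """Analytic third time derivatives (Ixx, Iyy, Ixy) for the equal-mass circular binary."""
    amp = 8.0 * M * a**2 * omega**3
    s = math.sin(2.0 * omega * t)
    c = math.cos(2.0 * omega * t)
    return amp * s, -amp * s, -amp * c


def luminosity(M, a, omega, samples=1000):
    """Orbit-averaged G/(5 c^5) <I'''_ij I'''_ij>, summed over all index pairs."""
    period = 2.0 * math.pi / omega
    total = 0.0
    for n in range(samples):
        ixx, iyy, ixy = third_derivatives(M, a, omega, n * period / samples)
        total += ixx**2 + iyy**2 + 2.0 * ixy**2   # xy and yx both contribute
    return G / (5.0 * C**5) * total / samples


def luminosity_closed_form(M, a, omega):
    """Hand-averaged result for the same source: 128 G M^2 a^4 w^6 / (5 c^5)."""
    return 128.0 * G * M**2 * a**4 * omega**6 / (5.0 * C**5)


if __name__ == "__main__":
    M, a, omega = 6.0e31, 5.0e5, 89.0   # hypothetical values in SI units
    print(luminosity(M, a, omega), luminosity_closed_form(M, a, omega))
```

The two numbers should agree to rounding error, since the integrand is in fact constant over the orbit for a circular binary.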

Exercises

(46.1) Verify the components in eqn 46.30.
(46.2) (a) Recall that the connection coefficients are
\begin{equation*}
\Gamma_{\mu \nu}^{\alpha}=\frac{1}{2} g^{\alpha \lambda}\left(\partial_{\mu} g_{\nu \lambda}+\partial_{\nu} g_{\lambda \mu}-\partial_{\lambda} g_{\mu \nu}\right)
\end{equation*}
and the Ricci tensor is
\begin{equation*}
R_{\mu \nu}=\partial_{\alpha} \Gamma^{\alpha}{ }_{\nu \mu}-\partial_{\nu} \Gamma^{\alpha}{ }_{\alpha \mu}+\Gamma^{\beta}{ }_{\beta \alpha} \Gamma^{\alpha}{ }_{\nu \mu}-\Gamma^{\beta}{ }_{\nu \alpha} \Gamma^{\alpha}{ }_{\beta \mu} . \tag{46.80}
\end{equation*}
In linearized gravity, we take $g_{\mu \nu}=\eta_{\mu \nu}+h_{\mu \nu}$ so that $g^{\mu \nu}=\eta^{\mu \nu}-h^{\mu \nu}$ (and hence $g_{\mu \alpha} g^{\alpha \nu}=\delta_{\mu}^{\nu}+\mathrm{O}\left(h^{2}\right)$). The tensor field $h_{\mu \nu}$ is symmetric, so we are free to swap indices around since $h_{\mu \nu}=h_{\nu \mu}$. Hence, show that the quadratic contribution to the Ricci tensor is
\begin{align*}
R_{\mu \nu}^{(2)}= & -\frac{1}{2} \partial_{\alpha}\left(h^{\alpha \lambda}\left[\partial_{\nu} h_{\mu \lambda}+\partial_{\mu} h_{\lambda \nu}-\partial_{\lambda} h_{\nu \mu}\right]\right) \\
& +\frac{1}{2} \partial_{\nu}\left(h^{\alpha \lambda}\left[\partial_{\alpha} h_{\mu \lambda}+\partial_{\mu} h_{\lambda \alpha}-\partial_{\lambda} h_{\alpha \mu}\right]\right) \\
& +\frac{1}{4} \eta^{\beta \lambda} \eta^{\alpha \xi}\left(\partial_{\beta} h_{\alpha \lambda}+\partial_{\alpha} h_{\lambda \beta}-\partial_{\lambda} h_{\beta \alpha}\right) \\
& \quad \times\left(\partial_{\nu} h_{\mu \xi}+\partial_{\mu} h_{\xi \nu}-\partial_{\xi} h_{\nu \mu}\right) \\
& -\frac{1}{4} \eta^{\beta \lambda} \eta^{\alpha \xi}\left(\partial_{\nu} h_{\alpha \lambda}+\partial_{\alpha} h_{\lambda \nu}-\partial_{\lambda} h_{\nu \alpha}\right) \\
& \quad \times\left(\partial_{\beta} h_{\mu \xi}+\partial_{\mu} h_{\xi \beta}-\partial_{\xi} h_{\beta \mu}\right) . \tag{46.81}
\end{align*}
(b) Group the terms in eqn 46.81 and show that it reduces to
\begin{align*}
R_{\mu \nu}^{(2)} & =\frac{1}{2}\Big[\underbrace{\frac{1}{2} \partial_{\mu} h_{\alpha \beta} \partial_{\nu} h^{\alpha \beta}}_{1}+\underbrace{h^{\alpha \beta} \partial_{\nu} \partial_{\mu} h_{\alpha \beta}}_{2} \\
& +\underbrace{\left(-h^{\alpha \beta} \partial_{\nu} \partial_{\alpha} h_{\mu \beta}\right)}_{3}+\underbrace{\left(-h^{\alpha \beta} \partial_{\beta} \partial_{\mu} h_{\alpha \nu}\right)}_{4} \\
& +\underbrace{h^{\alpha \beta} \partial_{\alpha} \partial_{\beta} h_{\mu \nu}}_{5}+\underbrace{\partial^{\alpha} h_{\mu}^{\beta} \partial_{\alpha} h_{\beta \nu}}_{6} \\
& +\underbrace{\left(-\partial^{\alpha} h_{\mu}^{\beta} \partial_{\beta} h_{\alpha \nu}\right)}_{7}+\underbrace{\left(-\partial_{\beta} h^{\alpha \beta} \partial_{\nu} h_{\mu \alpha}\right)}_{8} \\
& +\underbrace{\partial_{\alpha} h^{\alpha \beta} \partial_{\beta} h_{\mu \nu}}_{9}+\underbrace{\left(-\partial_{\alpha} h^{\alpha \beta} \partial_{\mu} h_{\beta \nu}\right)}_{10} \\
& +\underbrace{\left(-\frac{1}{2} \partial_{\alpha} h_{\mu \nu} \partial^{\alpha} h\right)}_{11}+\underbrace{\frac{1}{2} \partial_{\nu} h_{\alpha \mu} \partial^{\alpha} h}_{12} \\
& +\underbrace{\frac{1}{2} \partial_{\mu} h_{\nu}^{\alpha} \partial_{\alpha} h}_{13}\Big], \tag{46.82}
\end{align*}
where $h=h_{\alpha}{ }^{\alpha}$.
(c) Show that the only two terms that survive in this expression are the first two, using
  • Tracelessness ($h=0$) [this annihilates terms 11, 12, and 13].
  • The gauge condition $\partial_{\mu} h^{\mu \nu}=0$ [this annihilates 4, 8, 9, and 10].
  • Any divergence vanishes on the boundary (after averaging over a volume). [This results in 3, 5, and 7 vanishing once you also apply the gauge condition, and 6 similarly going once you use the field equations $\partial^{2} h_{\alpha \mu}=0$.]
Show further that the Ricci scalar vanishes using these conditions.
(d) Hence, show that
\begin{equation*}
\left\langle G_{\mu \nu}^{(2)}\right\rangle=-\frac{1}{4}\left\langle\partial_{\mu} h_{\alpha \beta} \partial_{\nu} h^{\alpha \beta}\right\rangle . \tag{46.83}
\end{equation*}
(46.3) (a) Expand the conservation law $T^{\mu \nu}{ }_{, \mu}=0$ and show that
\begin{align*}
\partial_{0} T^{00} & =-\partial_{k} T^{k 0}, \tag{46.84}\\
\partial_{0} T^{0 i} & =-\partial_{k} T^{k i} . \tag{46.85}
\end{align*}
(b) Define the tensor field $\mathfrak{I}^{i j}$ by
\begin{equation*}
\mathfrak{I}^{i j}=x^{i} x^{j} T^{00} . \tag{46.86}
\end{equation*}
This is related to the moment of inertia tensor $I_{i j}$ via
\begin{equation*}
I_{i j}=\int_{\Sigma} \mathrm{d}^{3} y\, \mathfrak{I}_{i j} . \tag{46.87}
\end{equation*}
Using the results in (a), show that $\dot{\mathfrak{I}}_{i j}=\partial_{0} \mathfrak{I}_{i j}$ is given by
\begin{equation*}
\dot{\mathfrak{I}}_{i j}=-\partial_{k}\left(x^{i} x^{j} T^{k 0}\right)+x^{j} T^{i 0}+x^{i} T^{j 0} . \tag{46.88}
\end{equation*}
Show further that
\begin{equation*}
\ddot{\mathfrak{I}}_{i j}=-\partial_{k}\left(x^{i} x^{j} \partial_{0} T^{k 0}+x^{j} T^{i k}+x^{i} T^{j k}\right)+2 T^{i j} . \tag{46.89}
\end{equation*}
(c) Integrate eqn 46.89 over the region $\Sigma$ and show that
\begin{equation*}
\ddot{I}_{i j}=\int_{\Sigma} \mathrm{d}^{3} y\, \ddot{\mathfrak{I}}_{i j}=2 \int_{\Sigma} \mathrm{d}^{3} y\, T_{i j} . \tag{46.90}
\end{equation*}
(d) Use eqn 46.90 to prove eqn 46.43, i.e. show that
\begin{equation*}
\bar{h}_{i j}(t, \vec{x})=\frac{2 G}{R} \frac{\partial^{2}}{\partial t^{2}} \int_{\Sigma} \mathrm{d}^{3} y\left[y_{i} y_{j} T^{00}(t-R, \vec{y})\right] . \tag{46.91}
\end{equation*}
(46.4) In Newtonian gravitation, typical velocities and accelerations are $v^{2} \approx G m / r$ and $a \approx G m / r^{2}$, respectively.
(a) Using the result of the previous problem, show that we expect
\begin{equation*}
h \approx \frac{G m}{R} v^{2} . \tag{46.92}
\end{equation*}
(b) The second-order terms in the weak-field expansion suggest the gravitational energy-density $t$ varies as $t \approx(\dot{h})^{2} / G$. Show that
\begin{equation*}
t \approx \frac{1}{R^{2}} \frac{G^{4} m^{5}}{r^{5}} . \tag{46.93}
\end{equation*}
(c) Integrating over a sphere of radius $R$ and restoring factors, show further that the power radiated via gravitational radiation is approximately
\begin{equation*}
P \approx \frac{G^{4} m^{5}}{r^{5} c^{5}} . \tag{46.94}
\end{equation*}
(d) Estimate the power radiated by (i) the solar system, (ii) a collapsing binary star, formed from two stellar-mass black holes, and (iii) a fist, shaken in anger.
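The order-of-magnitude estimate in eqn 46.94 is easy to evaluate. The sketch below wraps it in a function and applies it to a hypothetical pair of 30-solar-mass black holes with a characteristic separation of 1000 km. These inputs, the constants and the function name are illustrative assumptions, and the result should be read only as a rough scale, not as a worked solution to part (d).

```python
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m s^-1
M_SUN = 1.989e30   # solar mass, kg


def gw_power_estimate(m, r):
    """Rough radiated power P ~ G^4 m^5 / (r^5 c^5) from eqn 46.94, in watts."""
    return G**4 * m**5 / (r**5 * C**5)


if __name__ == "__main__":
    m = 30.0 * M_SUN   # hypothetical mass scale, kg
    r = 1.0e6          # hypothetical orbital scale, m
    print("P ~", gw_power_estimate(m, r), "W")
    # Steep scaling: doubling the mass raises the estimate by 2**5 = 32.
    print("ratio for 2m:", gw_power_estimate(2.0 * m, r) / gw_power_estimate(m, r))
```

The fifth-power dependence on both $m$ and $1 / r$ is the main point: compact, massive systems dominate, while everyday masses radiate a negligible amount.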
(46.5) In Exercise 30.7, we considered a projection operator P i j P i j P_(ij)P_{i j}Pij which projected a vector onto a unit spatial vector n = ( n 1 , n 2 , n 3 ) n = n 1 , n 2 , n 3 vec(n)=(n_(1),n_(2),n_(3))\vec{n}=\left(n_{1}, n_{2}, n_{3}\right)n=(n1,n2,n3). We now want to find a traceless version of the same thing.
(a) Show that P i j = δ i j n i n j P i j = δ i j n i n j P_(ij)=delta_(ij)-n_(i)n_(j)P_{i j}=\delta_{i j}-n_{i} n_{j}Pij=δijninj acts as a projection operator on a vector v v vec(v)\vec{v}v so that n i P i j v j = 0 n i P i j v j = 0 n_(i)P_(ij)v_(j)=0n_{i} P_{i j} v_{j}=0niPijvj=0. Show also that Tr P = P i i = 2 Tr P = P i i = 2 Tr P=P_(ii)=2\operatorname{Tr} P=P_{i i}=2TrP=Pii=2 and P i j P j k = P i k P i j P j k = P i k P_(ij)P_(jk)=P_(ik)P_{i j} P_{j k}=P_{i k}PijPjk=Pik.
(b) The action of this projection operator on a tensor M i j M i j M_(ij)M_{i j}Mij requires it to be used twice, so that
(46.95) M k = P i k P j M i j (46.95) M k = P i k P j M i j {:(46.95)M_(kℓ)^(')=P_(ik)P_(jℓ)M_(ij):}\begin{equation*} M_{k \ell}^{\prime}=P_{i k} P_{j \ell} M_{i j} \tag{46.95} \end{equation*}(46.95)Mk=PikPjMij
and show that M k n k = M k n = 0 M k n k = M k n = 0 M_(kℓ)^(')n_(k)=M_(kℓ)^(')n_(ℓ)=0M_{k \ell}^{\prime} n_{k}=M_{k \ell}^{\prime} n_{\ell}=0Mknk=Mkn=0. However, it is not traceless, and you can show this by proving that M k k = Tr ( P M ) M k k = Tr ( P M ) M_(kk)^(')=Tr(PM)M_{k k}^{\prime}=\operatorname{Tr}(P M)Mkk=Tr(PM).
(c) The solution is to use the traceless transverse projection operator
(46.96) M k TT = ( P i k P j 1 2 P k P i j ) M i j (46.96) M k TT = P i k P j 1 2 P k P i j M i j {:(46.96)M_(kℓ)^(TT)=(P_(ik)P_(jℓ)-(1)/(2)*P_(kℓ)P_(ij))M_(ij):}\begin{equation*} M_{k \ell}^{\mathrm{TT}}=\left(P_{i k} P_{j \ell}-\frac{1}{2} \cdot P_{k \ell} P_{i j}\right) M_{i j} \tag{46.96} \end{equation*}(46.96)MkTT=(PikPj12PkPij)Mij
and show that this is traceless (i.e. M k k TT = 0 M k k TT = 0 M_(kk)^(TT)=0M_{k k}^{\mathrm{TT}}=0MkkTT=0 ).
(d) If M M MMM itself is traceless and symmetric, show that
M i j TT M TT i j = ( P i k P j P i m P j n P i k P j P i j P m n + 1 4 P i j P i j P k P m n ) M k M m n = M i j M i j 2 n a n b M i a M j b (46.97) + 1 2 n a n b n c n d M a b M c d . M i j TT M TT i j = P i k P j P i m P j n P i k P j P i j P m n + 1 4 P i j P i j P k P m n M k M m n = M i j M i j 2 n a n b M i a M j b (46.97) + 1 2 n a n b n c n d M a b M c d . {:[M_(ij)^(TT)M_(TT)^(ij)=(P_(ik)P_(jℓ)P_(im)P_(jn)-P_(ik)P_(jℓ)P_(ij)P_(mn):}],[{:+(1)/(4)P_(ij)P_(ij)P_(kℓ)P_(mn))M_(kℓ)M_(mn)],[=M_(ij)M_(ij)-2n_(a)n_(b)M_(ia)M_(jb)],[(46.97)+(1)/(2)n_(a)n_(b)n_(c)n_(d)M_(ab)M_(cd).]:}\begin{align*} M_{i j}^{\mathrm{TT}} M_{\mathrm{TT}}^{i j} & =\left(P_{i k} P_{j \ell} P_{i m} P_{j n}-P_{i k} P_{j \ell} P_{i j} P_{m n}\right. \\ & \left.+\frac{1}{4} P_{i j} P_{i j} P_{k \ell} P_{m n}\right) M_{k \ell} M_{m n} \\ & =M_{i j} M_{i j}-2 n_{a} n_{b} M_{i a} M_{j b} \\ & +\frac{1}{2} n_{a} n_{b} n_{c} n_{d} M_{a b} M_{c d} . \tag{46.97} \end{align*}MijTTMTTij=(PikPjPimPjnPikPjPijPmn+14PijPijPkPmn)MkMmn=MijMij2nanbMiaMjb(46.97)+12nanbncndMabMcd.
(e) The integral over all solid angles is $\int \mathrm{d} \Omega=4 \pi$. Show also that
\begin{equation*} \int n_{i} n_{j} \, \mathrm{d} \Omega=\frac{4 \pi}{3} \delta_{i j} \tag{46.98} \end{equation*}
and
\begin{equation*} \int n_{i} n_{j} n_{k} n_{\ell} \, \mathrm{d} \Omega=\frac{4 \pi}{15}\left(\delta_{i j} \delta_{k \ell}+\delta_{i k} \delta_{j \ell}+\delta_{i \ell} \delta_{j k}\right) . \tag{46.99} \end{equation*}
Hence, show that
\begin{equation*} \int\left\langle\dddot{I}_{i j}^{\mathrm{TT}} \dddot{I}_{i j}^{\mathrm{TT}}\right\rangle \mathrm{d} \Omega=\frac{24 \pi}{15}\left\langle\dddot{I}_{i j} \dddot{I}_{i j}\right\rangle . \tag{46.100} \end{equation*}
This can be used to prove eqn 46.60.
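The projector properties in parts (b) to (d) can be spot-checked numerically before attempting the algebra. The following is a minimal sketch in plain Python, assuming a randomly chosen unit vector $n$ and a random traceless symmetric $M$; the function names are illustrative and do not come from the text.

```python
import math
import random

def projector(n):
    """P_ij = delta_ij - n_i n_j for a unit vector n."""
    return [[(1.0 if i == j else 0.0) - n[i] * n[j] for j in range(3)]
            for i in range(3)]

def tt_project(M, n):
    """M^TT_kl = (P_ik P_jl - 1/2 P_kl P_ij) M_ij, as in eqn 46.96."""
    P = projector(n)
    R = range(3)
    tr_PM = sum(P[i][j] * M[i][j] for i in R for j in R)
    return [[sum(P[i][k] * P[j][l] * M[i][j] for i in R for j in R)
             - 0.5 * P[k][l] * tr_PM
             for l in R] for k in R]

def random_traceless_symmetric(rng):
    """Random symmetric 3x3 matrix with its trace removed."""
    A = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
    S = [[0.5 * (A[i][j] + A[j][i]) for j in range(3)] for i in range(3)]
    tr = S[0][0] + S[1][1] + S[2][2]
    for i in range(3):
        S[i][i] -= tr / 3.0
    return S

def random_unit_vector(rng):
    v = [rng.gauss(0, 1) for _ in range(3)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

rng = random.Random(0)
R = range(3)
n = random_unit_vector(rng)
M = random_traceless_symmetric(rng)
MTT = tt_project(M, n)

# Tracelessness and transversality of the projected tensor.
trace_tt = sum(MTT[k][k] for k in R)
transverse = [sum(MTT[k][l] * n[k] for k in R) for l in R]

# Both sides of eqn 46.97 (valid for traceless symmetric M).
lhs = sum(MTT[i][j] ** 2 for i in R for j in R)
Mn = [sum(M[i][a] * n[a] for a in R) for i in R]          # M_ia n_a
nMn = sum(n[a] * n[b] * M[a][b] for a in R for b in R)    # n_a n_b M_ab
rhs = (sum(M[i][j] ** 2 for i in R for j in R)
       - 2 * sum(x * x for x in Mn) + 0.5 * nMn ** 2)
```

Agreement of `lhs` and `rhs` to rounding error for several random draws is a useful sanity check, although it is of course not a proof.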
(46.6) Consider two black holes of the same mass $M$ in a circular orbit of radius $a$ around their common centre of mass. At a time $t$ the black holes are at positions with Cartesian coordinates
\begin{equation*} (x, y, z)=(a \cos \omega t, a \sin \omega t, 0) \tag{46.101} \end{equation*}
and
\begin{equation*} (x, y, z)=(-a \cos \omega t,-a \sin \omega t, 0) \tag{46.102} \end{equation*}
respectively.
(a) Show that the angular frequency of the circular motion is given by $\omega=\left(G M / 4 a^{3}\right)^{\frac{1}{2}}$.
(b) Show that the moment of inertia tensor is given by
\begin{aligned} I^{i j} & =2 M a^{2}\left(\begin{array}{ccc} \cos ^{2} \omega t & \cos \omega t \sin \omega t & 0 \\ \cos \omega t \sin \omega t & \sin ^{2} \omega t & 0 \\ 0 & 0 & 0 \end{array}\right) \\ & =M a^{2}\left(\begin{array}{ccc} 1+\cos 2 \omega t & \sin 2 \omega t & 0 \\ \sin 2 \omega t & 1-\cos 2 \omega t & 0 \\ 0 & 0 & 0 \end{array}\right) . \end{aligned}
(c) Show further that the oscillating gravitational field a distance $R$ away is given by
\begin{aligned} \bar{h}_{\mu \nu}(t, R)= & -\frac{8 M G a^{2} \omega^{2}}{R} \\ & \times\left(\begin{array}{cccc} 0 & 0 & 0 & 0 \\ 0 & \cos 2 \omega t_{\mathrm{r}} & \sin 2 \omega t_{\mathrm{r}} & 0 \\ 0 & \sin 2 \omega t_{\mathrm{r}} & -\cos 2 \omega t_{\mathrm{r}} & 0 \\ 0 & 0 & 0 & 0 \end{array}\right), \end{aligned}
where the retarded time is $t_{\mathrm{r}}=t-R$. This equation tells us that the oscillating field has a frequency twice that of the orbit of the binary system. It is in the form of eqn 46.21 and so also describes the gravitational radiation emitted in the $z$-direction.
(d) Show finally that
\begin{equation*} \dddot{I}^{i j}=8 M a^{2} \omega^{3}\left(\begin{array}{ccc} \sin 2 \omega t & -\cos 2 \omega t & 0 \\ -\cos 2 \omega t & -\sin 2 \omega t & 0 \\ 0 & 0 & 0 \end{array}\right) \end{equation*}
and hence
\begin{equation*} \left\langle\dddot{I}^{i j} \dddot{I}_{i j}\right\rangle=128 M^{2} a^{4} \omega^{6} . \tag{46.106} \end{equation*}
(46.7) Using the results of the previous problem, derive eqn 46.69 using the expression given in eqn 46.63.
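The results quoted in Exercise 46.6(b) and (d) can be checked numerically. The sketch below is an illustration only: it builds $I^{ij}$ directly from the two positions in eqns 46.101 and 46.102, estimates the third time derivative with a central finite difference (a choice made here, not a method from the text), and compares it with the closed form in part (d). Units with $M=a=\omega=1$ and the sample time are arbitrary assumptions.

```python
import math

def inertia(t, M, a, w):
    """I^{ij} = sum over both bodies of M x^i x^j (eqns 46.101 and 46.102)."""
    c, s = math.cos(w * t), math.sin(w * t)
    positions = [(a * c, a * s, 0.0), (-a * c, -a * s, 0.0)]
    return [[sum(M * p[i] * p[j] for p in positions) for j in range(3)]
            for i in range(3)]

def third_derivative(t, M, a, w, h=1e-3):
    """Central finite-difference estimate of the third time derivative of I^{ij}."""
    Ip2, Ip1 = inertia(t + 2 * h, M, a, w), inertia(t + h, M, a, w)
    Im1, Im2 = inertia(t - h, M, a, w), inertia(t - 2 * h, M, a, w)
    return [[(Ip2[i][j] - 2 * Ip1[i][j] + 2 * Im1[i][j] - Im2[i][j]) / (2 * h ** 3)
             for j in range(3)] for i in range(3)]

def analytic_third_derivative(t, M, a, w):
    """Closed form quoted in part (d) of the exercise."""
    c, s = math.cos(2 * w * t), math.sin(2 * w * t)
    pref = 8 * M * a ** 2 * w ** 3
    return [[pref * s, -pref * c, 0.0],
            [-pref * c, -pref * s, 0.0],
            [0.0, 0.0, 0.0]]

M, a, w, t = 1.0, 1.0, 1.0, 0.37   # arbitrary illustrative values
numeric = third_derivative(t, M, a, w)
analytic = analytic_third_derivative(t, M, a, w)
max_diff = max(abs(numeric[i][j] - analytic[i][j])
               for i in range(3) for j in range(3))

# The contraction is time independent, so no averaging is needed here.
contraction = sum(analytic[i][j] ** 2 for i in range(3) for j in range(3))
expected = 128 * M ** 2 * a ** 4 * w ** 6
```

Because the contraction does not depend on $t$, the time average in eqn 46.106 is trivial for a circular orbit.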
$^{1}$ This is a question Richard Feynman (1918-1988) asked himself in his course on gravity. Our approach in this chapter follows Feynman's resulting Lectures on Gravitation (1995). This was also one of the early paths taken by several scientists (including Feynman) attempting to formulate a quantum theory of gravity. Although informative in many ways, it would ultimately prove unsuccessful. For an introduction to the history, see A. Ashtekar, Quantum Gravity, arXiv:gr-qc/0410054v2 (2004).
$^{2}$ We won't assume any familiarity with the techniques of quantum field theory in this chapter.
$\curvearrowright$ Since this chapter discusses a hypothetical particle (i.e. the graviton) it can be skipped on a first reading.

The properties of gravitons

Though free to think and act, we are held together, like the stars in the firmament, with ties inseparable. These ties cannot be seen, but we can feel them. I cut myself in the finger, and it pains me: this finger is a part of me. I see a friend hurt, and it hurts me, too: my friend and I are one. And now I see stricken down an enemy, a lump of matter which, of all the lumps of matter in the universe, I care least for, and it still grieves me. Does this not prove that each of us is only part of a whole? Nikola Tesla (1856-1943)

Imagine a parallel world where civilization had formulated quantum field theory (QFT) but had no geometrical theory of gravitation. In seeking to describe gravity with the tools at hand, where would their reasoning take them?$^{1}$ We shall suggest in this chapter that gravitational excitations would be a natural place for them to start. This would lead to the idea of a graviton, a force-carrying particle derived from the quantization of gravitational waves.
In this chapter, we therefore pick up the discussion of the gravitational interactions between masses from the point of view of field theory. We shall work in the weak-field limit, where gravitation is described by a field $\boldsymbol{h}(x)$ and indices are raised and lowered by the Minkowski metric $\boldsymbol{\eta}$. Our plan is to work out as much as we can about gravity waves by drawing on the concepts of QFT,$^{2}$ where the waves are quantized into force-carrying graviton particles. Although we do not yet have a quantum field theory of gravitation, we can still make progress using the tools from field theory and, indeed, some candidate theories of gravitation predict the existence of gravitons. We shall see that, as in the previous two chapters, we can gain a certain amount of insight by comparing gravity waves to light waves or, in this quantum context, comparing photons to gravitons. The result will be a rather different way to think about gravitational interactions from the one we have considered thus far.

47.1 Force-carrying particles

One of the most interesting things about particles is that they interact with each other. Hideki Yukawa's great insight was his suggestion that this interaction process itself involves particles. These force-carrying particles are slightly different to their more familiar cousins with whom we're already acquainted. Yukawa's idea centres around one key notion:
Particles interact by exchanging virtual, force-carrying particles.
Recall that a virtual particle is one defined as existing 'off mass-shell'.$^{3}$ Interactions are described by field theories. The pattern we've seen in classical field theories (such as electromagnetism and gravitation) is (i) that sources of a field tell the field how to arrange itself; (ii) the field then acts on the sources, via virtual particles, telling them how to move. In quantum electrodynamics (or QED, which is the quantum upgrade of classical electromagnetism), the sources of the fields are electric charges and currents, arranged into a current 4-vector $\boldsymbol{J}$. The force-carrying particles are photons, which are the massless particle excitations of the electromagnetic field $\tilde{\boldsymbol{A}}(x)$. One explanation for the photon having zero mass is that a massless virtual particle is the only way to produce the Coulomb interaction potential $V(r) \propto 1 / r$. A massive force-carrying particle would lead to a potential that falls off more rapidly with distance, as examined in the next example.

Example 47.1

The role of mass can be understood by considering the mathematical form of the interaction mediated by Yukawa's force-carrying particles. The Yukawa potential is written as $U(\vec{r}) \propto-\frac{\mathrm{e}^{-\alpha m|\vec{r}|}}{4 \pi|\vec{r}|}$, where $m$ is the mass of a force-carrying particle and $\alpha$ is a parameter. This potential obeys the Green's function$^{4}$ equation
\begin{equation*} \left(\vec{\nabla}^{2}-\alpha^{2} m^{2}\right) U(\vec{r})=\delta^{(3)}(\vec{r}) . \tag{47.1} \end{equation*}
This equation is a form of the Klein-Gordon equation (see Chapter 40), which is the equation of motion for massive, spinless $(S=0)$ particles. The effective potential representing the interaction mediated by such particles must also obey this equation of motion.
What is the analogous expression for electromagnetism? We know that the Coulomb potential $V(\vec{r}) \propto 1 /|\vec{r}|$ must obey Poisson's equation, which is the name we give the Green's function equation
\begin{equation*} \vec{\nabla}^{2} V(\vec{r})=\delta^{(3)}(\vec{r}) . \tag{47.2} \end{equation*}
We see that everything is consistent if we set $m=0$ for the case of electromagnetism (corresponding to a massless photon). Conversely, if $m \neq 0$, then the electromagnetic interaction would have an exponential contribution $\mathrm{e}^{-\alpha m|\vec{r}|}$, which it does not.
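Away from the origin, where the delta function vanishes, eqn 47.1 reduces to $\vec{\nabla}^{2} U=\alpha^{2} m^{2} U$, and for a spherically symmetric function $\vec{\nabla}^{2} U=\frac{1}{r} \frac{\mathrm{d}^{2}}{\mathrm{d} r^{2}}(r U)$. The following is a small numerical sketch of that statement, assuming the normalization $U=-\mathrm{e}^{-\mu r} / 4 \pi r$ with the shorthand $\mu=\alpha m$ (a symbol introduced here for convenience); the finite-difference step and sample radii are arbitrary choices.

```python
import math

def yukawa(r, mu):
    """U(r) = -exp(-mu r) / (4 pi r); mu = 0 gives the Coulomb-like 1/r form."""
    return -math.exp(-mu * r) / (4 * math.pi * r)

def radial_laplacian(f, r, h=1e-4):
    """Laplacian of a spherically symmetric f(r): (1/r) d^2(r f)/dr^2,
    estimated with a central second difference."""
    g = lambda s: s * f(s)
    return (g(r + h) - 2 * g(r) + g(r - h)) / (h * h * r)

mu = 0.8  # illustrative value of alpha * m
residuals = []
for r in (0.5, 1.0, 2.0, 5.0):
    lap = radial_laplacian(lambda s: yukawa(s, mu), r)
    # Away from r = 0 the right-hand side of eqn 47.1 vanishes.
    residuals.append(abs(lap - mu ** 2 * yukawa(r, mu)))

# With mu = 0 the same function satisfies Laplace's equation for r > 0.
coulomb_residual = abs(radial_laplacian(lambda s: yukawa(s, 0.0), 1.0))
```

The check says nothing about the delta function at the origin; it only confirms that the exponential factor is what the mass term demands for $r>0$.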
A massless photon necessarily has only two polarization states, associated with the two polarizations of light. The photon also has a spin$^{5}$ $S=1$. We would now like to identify the analogous properties of the graviton, the particle excitation of the metric field of gravity that mediates the gravitational force. The gravitational potential varies as $\Phi(r) \propto 1 / r$ and so this constrains the mass of the virtual particle to be $m=0$. What is the spin of the graviton? A spin $S=1$ theory has the property that like charges repel and unlike charges attract.$^{6}$ This is incompatible with gravity, which is purely attractive. Even-integer spin exchange leads exclusively to forces of one sign (either purely attractive
$^{3}$ See Chapter 28. We saw that the argument is that quantum mechanics allows us to violate this classical dispersion relation, as long as we don't do it for too long. By invoking energy-time uncertainty $\Delta E \Delta t \sim \hbar$, we can say that particles of energy $E$ are allowed to exist off the mass-shell as long as they live for a short time $\Delta t \lesssim \hbar / E$. Virtual particles, therefore, must have a finite range since they can't live forever and they travel at finite velocity.
$^{4}$ A Green's function $G$ is the solution of a differential equation of the form $\hat{L} G=\delta$, where $\hat{L}$ is a linear operator and $\delta$ is a delta function.
$^{5}$ We summarize here the different spins and tensors associated with different field theories:
$S=0$: $\phi(x)$, scalar
$S=1$: $\tilde{\boldsymbol{A}}(x)$, vector/1-form
$S=2$: $\boldsymbol{h}(x)$, second-rank tensor.
$^{6}$ Or vice versa, if the coupling constant has opposite sign.
$^{7}$ See Exercise 42.13.
Fig. 47.1 The exchange of a virtual photon causes two currents to interact.
$^{8}$ Owing to the symmetry of the interaction, which is reflected in the diagram in Fig. 47.1, each particle actually plays both the role of the scattered particle and that of the source of the scattering potential. However, the argument presented here will get us to the right answer.
Fig. 47.2 One current acting as the source of the field with which the other interacts.
$^{9}$ We can see the motivation for this by combining eqn 47.3 with the interaction Lagrangian $\mathcal{L}=A_{\mu}(x) J^{\mu}(x)$, which leads us to predict an interaction with the form $\boldsymbol{J}_{a} \frac{1}{\boldsymbol{k}^{2}} \boldsymbol{J}_{b}$. This is indeed the case.
or purely repulsive), so the graviton must be one of $S=0,2,4, \ldots$. A scalar field $\phi(x)$ has $S=0$. However, an $S=0$ theory turns out to be too simple to capture gravitation$^{7}$ (it makes incorrect predictions for the energetics of gravitation, for one thing). We shall therefore assume that the graviton is an $S=2$ particle and, as a result, must be described by a symmetric tensor field. In order to extract some properties of the graviton, we first take a side step and examine the properties of the photon within quantum field theory. We shall then discuss gravitation by making exactly the same mathematical steps, substituting gravitation for electromagnetism. This will reveal the form of the graviton interaction and show the role of graviton polarization.

47.2 Photon propagation and polarization

We know that two electrons have interacted if, on approaching each other, their motion is altered by each other's presence. This is a rough description of scattering, where incoming particles change their momentum states owing to interactions. The probability of scattering is encoded in a quantum mechanical amplitude, and it is this that we shall compute in this section. A useful method of understanding and computing these amplitudes makes use of Feynman diagrams. These are momentum-space cartoons of the scattering processes that encode the equations involved in a perturbation expansion of the underlying quantum field theory. The simplest Feynman diagram for electromagnetic electron-electron interactions is shown in Fig. 47.1. Conceptually, the diagram can be understood in terms of the current representing the motion of one electron (called the $b$-particle for the sake of argument) being the source of the field $\tilde{\boldsymbol{A}}(x)$ with which the current $J_{a}(x)$ representing the other electron (the $a$-particle) interacts, as shown in Fig. 47.2.$^{8}$
In general, an electromagnetic field $\tilde{\boldsymbol{A}}$ interacts with a current at a point $x$ via a term in the Lagrangian $\mathcal{L}=J^{\mu}(x) A_{\mu}(x)$. As we said above, the source of the electromagnetic field in this case is the current of the $b$-particle with components $\left(\boldsymbol{J}_{b}\right)^{\mu}$. The resulting $\tilde{\boldsymbol{A}}$ field has components that can be described (after a suitable choice of gauge) by the equation of motion $\partial^{2} A^{\mu}=-\left(J_{b}\right)^{\mu}$ or, equivalently, the momentum-space equation
\begin{equation*} A^{\mu}(\boldsymbol{k})=\frac{1}{\boldsymbol{k}^{2}}\left(\boldsymbol{J}_{b}\right)^{\mu} . \tag{47.3} \end{equation*}
We can understand the interaction of this field with the other current by computing a scattering amplitude $\mathcal{A}$ for the current of $b$-electrons $\left(\boldsymbol{J}_{b}\right)$ to interact with a current of $a$-electrons $\left(\boldsymbol{J}_{a}\right)$. This has the component form$^{9}$
\begin{equation*} \mathrm{i} \mathcal{A}=(\text { Current })_{a}^{\mu}\binom{\text { Virtual-particle }}{\text { propagator }}_{\mu \nu}(\text { Current })_{b}^{\nu} . \tag{47.4} \end{equation*}
The part in the middle, called the propagator, tells us the probability amplitude for a virtual, force-carrying particle to interact with current $a$ and current $b$. This is the process shown in Fig. 47.1. The solid lines represent the currents; the wiggly line represents the photon propagator.
Working in momentum space, the flat-space photon propagator giving the amplitude describing a photon with wavevector $\boldsymbol{k}$ is given by a tensor $D$ with components$^{10}$
\begin{equation*} \tilde{D}_{\mu \nu}(k)=\frac{-\mathrm{i} \eta_{\mu \nu}}{\boldsymbol{k}^{2}+\mathrm{i} \varepsilon}, \tag{47.5} \end{equation*}
where $\boldsymbol{k}$ is the four-momentum of the photon. (The $\mathrm{i} \varepsilon$ factor deals with the fact that $k^{0} \neq|\vec{k}|$ for a virtual particle, but won't be important to us here and so will be dropped.) We therefore consider the amplitude
\begin{equation*} \mathcal{A}=-J_{a}^{\mu}\left(\frac{\eta_{\mu \nu}}{\boldsymbol{k}^{2}}\right) J_{b}^{\nu} . \tag{47.6} \end{equation*}
We shall use this to discuss the nature of the interaction and the polarization states of the photon.
Example 47.2
If we work in a frame where $k^{\mu}=\left(k^{0}, 0,0, k^{3}\right)$, then the amplitude looks like
\begin{equation*} \mathcal{A}=-\frac{\left(-J_{\mathrm{a}}^{0} J_{\mathrm{b}}^{0}+\vec{J}_{\mathrm{a}} \cdot \vec{J}_{\mathrm{b}}\right)}{-\left(k^{0}\right)^{2}+\left(k^{3}\right)^{2}} . \tag{47.7} \end{equation*}
We can immediately reduce the number of components using the momentum-space representation of the conservation of current $\boldsymbol{\nabla} \cdot \boldsymbol{J}=0$, which implies that $k_{\mu} J^{\mu}=0$. We then have $-k_{0} J^{0}+k_{3} J^{3}=0$, which allows us to eliminate the component $J^{3}$ to yield
\begin{equation*} \mathcal{A}=\frac{J_{\mathrm{a}}^{0} J_{\mathrm{b}}^{0}}{\left(k^{3}\right)^{2}}-\frac{J_{\mathrm{a}}^{1} J_{\mathrm{b}}^{1}+J_{\mathrm{a}}^{2} J_{\mathrm{b}}^{2}}{-\left(k^{0}\right)^{2}+\left(k^{3}\right)^{2}} . \tag{47.8} \end{equation*}
Now for some interpretation. The first term is written in terms of $J^{0}=\rho$, the electromagnetic charge density. If we (inverse) Fourier transform this quantity we obtain an instantaneously acting Coulomb potential, which is repulsive between like charges
\begin{equation*} \int \frac{\mathrm{d}^{4} k}{(2 \pi)^{4}} \mathrm{e}^{-\mathrm{i} \boldsymbol{k} \cdot x} \frac{J_{\mathrm{a}}^{0} J_{\mathrm{b}}^{0}}{\left(k^{3}\right)^{2}} \propto \frac{q^{2}}{4 \pi|\vec{r}|} \delta\left(t_{a}-t_{b}\right) . \tag{47.9} \end{equation*}
This only looks (unphysically) instantaneous because we've split up the propagator in a non-covariant manner. Moreover, this is the Coulomb term that dominates in the non-relativistic regime.
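The step from eqn 47.7 to eqn 47.8 is easy to verify numerically. The sketch below is illustrative only: it builds currents that satisfy the conservation condition by setting $J^{3}=k^{0} J^{0} / k^{3}$, then evaluates both forms of the amplitude. The function names and the sample numbers are assumptions made for this check.

```python
def conserved_current(J0, J1, J2, k0, k3):
    """Return (J^0, J^1, J^2, J^3) with J^3 fixed by -k^0 J^0 + k^3 J^3 = 0."""
    return (J0, J1, J2, k0 * J0 / k3)

def amplitude_covariant(Ja, Jb, k0, k3):
    """Eqn 47.7, using the metric signature (-,+,+,+)."""
    numerator = -Ja[0] * Jb[0] + Ja[1] * Jb[1] + Ja[2] * Jb[2] + Ja[3] * Jb[3]
    return -numerator / (-k0 ** 2 + k3 ** 2)

def amplitude_split(Ja, Jb, k0, k3):
    """Eqn 47.8: an instantaneous Coulomb term plus a retarded transverse term."""
    coulomb = Ja[0] * Jb[0] / k3 ** 2
    retarded = -(Ja[1] * Jb[1] + Ja[2] * Jb[2]) / (-k0 ** 2 + k3 ** 2)
    return coulomb + retarded

k0, k3 = 0.3, 1.1   # an off-shell (virtual) photon: k0 != k3
Ja = conserved_current(0.7, -0.2, 0.5, k0, k3)
Jb = conserved_current(-1.3, 0.4, 0.9, k0, k3)

covariant = amplitude_covariant(Ja, Jb, k0, k3)
split = amplitude_split(Ja, Jb, k0, k3)
```

The two values agree only because both currents are conserved; replacing $J^{3}$ by an arbitrary number breaks the equality.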
The part left over is retarded. That is, it depends on the finite time taken for the photon to propagate. For our case of photons propagating along the $z$- (or 3-) direction, we look at the amplitude of the second term and we see that there seem to be two sorts of photon: those that couple $J^{1}$ currents and those that couple $J^{2}$ currents. These are the two physical transverse photon polarizations. To see this, we decompose the term $\left(J_{a}^{1} J_{b}^{1}+J_{a}^{2} J_{b}^{2}\right)$ into a different basis by writing
\begin{equation*} J_{a}^{1} J_{b}^{1}+J_{a}^{2} J_{b}^{2}=\frac{1}{\sqrt{2}}\left(J_{a}^{1}+\mathrm{i} J_{a}^{2}\right) \frac{1}{\sqrt{2}}\left(J_{b}^{1}+\mathrm{i} J_{b}^{2}\right)^{\dagger}+\frac{1}{\sqrt{2}}\left(J_{a}^{1}-\mathrm{i} J_{a}^{2}\right) \frac{1}{\sqrt{2}}\left(J_{b}^{1}-\mathrm{i} J_{b}^{2}\right)^{\dagger} . \tag{47.10} \end{equation*}
This implies that two sorts of photons interact: the $J+\mathrm{i} J$ sort and the $J-\mathrm{i} J$ sort. These are indeed the two possible polarizations for the photons, although they are circularly polarized here (see below), compared to the linearly polarized states discussed in the last chapter.
$^{10}$ The details of this equation won't be important to us, apart from the $1 / \boldsymbol{k}^{2}$ part.
If, in the last example, we use coordinates $J_{a}^{1}=j \cos \phi$ and $J_{a}^{2}=j \sin \phi$, we see that the two polarizations can be represented as $j \mathrm{e}^{\mathrm{i} \phi}$ and $j \mathrm{e}^{-\mathrm{i} \phi}$. We conclude that these are circularly polarized photons and, using the angular momentum operator $\hat{L}_{z}=-\mathrm{i} \partial / \partial \phi$, they have spin $\pm 1$ respectively.
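Both claims about the circular basis can be checked with ordinary complex arithmetic. The following sketch assumes real transverse current components (so that the dagger reduces to complex conjugation) and uses illustrative function names. It confirms the decomposition in eqn 47.10 and shows that a rotation by an angle $\theta$ about the $z$-axis multiplies the $J^{1} \pm \mathrm{i} J^{2}$ combinations by $\mathrm{e}^{\pm \mathrm{i} \theta}$, the behaviour expected of spin $\pm 1$ states.

```python
import cmath
import math

def circular_components(J1, J2):
    """Return the (J^1 + i J^2)/sqrt(2) and (J^1 - i J^2)/sqrt(2) combinations."""
    plus = (J1 + 1j * J2) / math.sqrt(2)
    minus = (J1 - 1j * J2) / math.sqrt(2)
    return plus, minus

def transverse_coupling(Ja, Jb):
    """Right-hand side of eqn 47.10 for real transverse currents."""
    a_plus, a_minus = circular_components(*Ja)
    b_plus, b_minus = circular_components(*Jb)
    return a_plus * b_plus.conjugate() + a_minus * b_minus.conjugate()

def rotate(J, theta):
    """Rotate the transverse components (J^1, J^2) by theta about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * J[0] - s * J[1], s * J[0] + c * J[1])

Ja, Jb = (0.3, -1.2), (0.8, 0.5)          # arbitrary (J^1, J^2) pairs
linear = Ja[0] * Jb[0] + Ja[1] * Jb[1]    # left-hand side of eqn 47.10
circular = transverse_coupling(Ja, Jb)

theta = 0.4
plus, minus = circular_components(*Ja)
plus_rot, minus_rot = circular_components(*rotate(Ja, theta))
phase_plus = cmath.exp(1j * theta)        # expected factor for the + state
phase_minus = cmath.exp(-1j * theta)      # expected factor for the - state
```

A linearly polarized combination such as $J^{1}$ alone mixes with $J^{2}$ under the same rotation, which is why the circular basis is the natural one for reading off the spin.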
This concludes a round-up of the properties of photons. Next, we turn to gravitons.

47.3 Graviton propagation and polarization

We examine the case of the graviton by following exactly the steps that we followed for the photon. By analogy with the electromagnetic case, we have that the gravitational field $\boldsymbol{h}(x)$, which exists by virtue of a mass distribution being present, has components
\begin{equation*} h_{\mu \nu}(\boldsymbol{k}) \propto \frac{1}{\boldsymbol{k}^{2}} T_{\mu \nu} . \tag{47.11} \end{equation*}
The interaction of this field with another mass distribution can be examined by considering two distributions of mass-energy described by energy-momentum tensors $\boldsymbol{T}_{a}$ and $\boldsymbol{T}_{b}$ interacting via a propagator that reflects a massless gravity-carrying particle. Since $\boldsymbol{T}$ is a second-rank tensor, we need to deal with the extra indices, so the analogous scattering amplitude is given by
\begin{equation*} \mathcal{A}=-T_{a}^{\alpha \beta}\left(\frac{\eta_{\alpha \mu} \eta_{\beta \nu}}{\boldsymbol{k}^{2}+\mathrm{i} \varepsilon}\right) T_{b}^{\mu \nu} . \tag{47.12} \end{equation*}
In the same way that the field $\boldsymbol{A}(x)$ interacts with current $\boldsymbol{J}$ via an interaction term $\mathcal{L}=A_{\mu} J^{\mu}$, we expect the interaction of the gravitational field $\boldsymbol{h}$ and the energy-momentum $\boldsymbol{T}$ to take the form
\begin{equation*} \mathcal{L}=h_{\mu \nu} T^{\mu \nu}, \tag{47.13} \end{equation*}
where $h_{\mu \nu}$ are the components of the weak-field tensor $\boldsymbol{h}(x)$.
Newton's law can be expressed in terms of the instantaneous part of this interaction. Since we know that $T^{00}$ represents $\rho$, the mass density, then we expect the part reflecting Newton's law to be
\begin{equation*} -\frac{T_{a}^{00} T_{b}^{00}}{\left(k^{3}\right)^{2}}, \tag{47.14} \end{equation*}
where the sign ensures gravity is attractive. This equation is the (inverse) Fourier transform of the Newtonian potential energy. The retarded term then gives us the information on the graviton polarizations. We therefore expand out the amplitude and find
\begin{align*} -T_{a}^{\alpha \beta} \frac{\eta_{\alpha \mu} \eta_{\beta \nu}}{\boldsymbol{k}^{2}} T_{b}^{\mu \nu}= & \frac{-1}{-\left(k^{0}\right)^{2}+\left(k^{3}\right)^{2}}\left(T_{a}^{00} T_{b}^{00}-2 T_{a}^{03} T_{b}^{03}-2 T_{a}^{02} T_{b}^{02}\right. \\ & -2 T_{a}^{01} T_{b}^{01}+2 T_{a}^{23} T_{b}^{23}+2 T_{a}^{31} T_{b}^{31}+2 T_{a}^{21} T_{b}^{21} \\ & \left.+T_{a}^{33} T_{b}^{33}+T_{a}^{22} T_{b}^{22}+T_{a}^{11} T_{b}^{11}\right) . \tag{47.15} \end{align*}
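The signs and factors of two in eqn 47.15 follow from the diagonal metric $\eta=\operatorname{diag}(-1,1,1,1)$ and the symmetry of $\boldsymbol{T}$. A short sketch, assuming random symmetric tensors and illustrative function names, compares the full index contraction with the ten terms written out above.

```python
import random

ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal entries of the Minkowski metric

def random_symmetric(rng):
    """Random symmetric 4x4 array standing in for T^{mu nu}."""
    T = [[0.0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(i, 4):
            T[i][j] = T[j][i] = rng.uniform(-1, 1)
    return T

def full_contraction(Ta, Tb):
    """T_a^{ab} eta_{am} eta_{bn} T_b^{mn}; eta is diagonal, so m = a and n = b."""
    return sum(Ta[a][b] * ETA[a] * ETA[b] * Tb[a][b]
               for a in range(4) for b in range(4))

def expanded(Ta, Tb):
    """The bracket on the right-hand side of eqn 47.15, term by term."""
    p = lambda i, j: Ta[i][j] * Tb[i][j]
    return (p(0, 0) - 2 * p(0, 3) - 2 * p(0, 2) - 2 * p(0, 1)
            + 2 * p(2, 3) + 2 * p(3, 1) + 2 * p(2, 1)
            + p(3, 3) + p(2, 2) + p(1, 1))

rng = random.Random(1)
Ta, Tb = random_symmetric(rng), random_symmetric(rng)
difference = abs(full_contraction(Ta, Tb) - expanded(Ta, Tb))
```

Mixed time-space components pick up one factor of $\eta_{00}=-1$ and so enter with a minus sign, while the factor of two counts the two index orderings.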
One complication in dealing with a second-rank tensor $\boldsymbol{T}$ is that it carries around an invariant (i.e. scalar and therefore $S=0$) part: its trace $T$. The consequence of this is that when we split up the amplitude $\mathcal{A}$ in terms of the components of $\boldsymbol{T}$, it can erroneously appear that the graviton has three polarizations, rather than the two it must have. The remedy is to use this trace part by adding to our amplitude $\mathcal{A}=T_{\mu \nu}^{\prime}\left(1 / \boldsymbol{k}^{2}\right) T^{\mu \nu}$ a multiple of the trace-part
\begin{equation*} \alpha T^{\prime}\left(\frac{1}{\boldsymbol{k}^{2}}\right) T, \tag{47.16} \end{equation*}
where $T$ and $T^{\prime}$ denote traces. Here $\alpha$ is a constant that we are free to choose in order that we cancel off any illusory $S=0$ part and leave only $S=2$ gravitons. Let's set about decomposing the amplitude.

Example 47.3

As in the photon case, we use the momentum-space version of conservation of mass-energy, $\boldsymbol{\nabla} \cdot \boldsymbol{T}=0$, to write
\begin{equation*} k_{\mu} T_{b}^{\mu \nu}=0, \tag{47.17} \end{equation*}
which gives us
\begin{equation*} k_{0} T_{b}^{0 \nu}=k_{3} T_{b}^{3 \nu} . \tag{47.18} \end{equation*}
Using this to eliminate $T^{3 \nu}$ we find that the instantaneous part becomes
\begin{equation*} -\frac{1}{\left(k^{3}\right)^{2}}\left[T_{a}^{00} T_{b}^{00}\left(1-\frac{\left(k^{0}\right)^{2}}{\left(k^{3}\right)^{2}}\right)-2 T_{a}^{01} T_{b}^{01}-2 T_{a}^{02} T_{b}^{02}\right] . \tag{47.19} \end{equation*}
The retarded term is then given by
\begin{equation*} \frac{-1}{-\left(k^{0}\right)^{2}+\left(k^{3}\right)^{2}}\left(T_{a}^{11} T_{b}^{11}+T_{a}^{22} T_{b}^{22}+2 T_{a}^{21} T_{b}^{21}\right) . \tag{47.20} \end{equation*}
To remove the $S=0$ contribution we add the term in eqn 47.16, which contributes a piece to the retarded term of
\begin{equation*} \alpha \frac{1}{-\left(k^{0}\right)^{2}+\left(k^{3}\right)^{2}}\left(T_{a}^{11}+T_{a}^{22}\right)\left(T_{b}^{11}+T_{b}^{22}\right) . \tag{47.21} \end{equation*}
Now we choose $\alpha$ so that there are only two terms in the $S=2$ retarded part (reflecting the two polarizations). This is achieved by setting $\alpha=1 / 2$ and so we obtain
\begin{equation*} \frac{-1}{-\left(k^{0}\right)^{2}+\left(k^{3}\right)^{2}}\left[\frac{1}{2}\left(T_{a}^{11}-T_{a}^{22}\right)\left(T_{b}^{11}-T_{b}^{22}\right)+2 T_{a}^{12} T_{b}^{12}\right] . \tag{47.22} \end{equation*}
Since we can use the symmetry of T T T\boldsymbol{T}T to rewrite 2 T b 12 = ( T b 12 + T b 21 ) 2 T b 12 = T b 12 + T b 21 2T_(b)^(12)=(T_(b)^(12)+T_(b)^(21))2 T_{b}^{12}=\left(T_{b}^{12}+T_{b}^{21}\right)2Tb12=(Tb12+Tb21), we can write the retarded part of the amplitude as
1 ( k 0 ) 2 + ( k 3 ) 2 [ 1 2 ( T a 11 T a 22 ) ( T b 11 T b 22 ) + 1 2 ( T a 12 + T a 21 ) ( T b 12 + T b 21 ) ] 1 k 0 2 + k 3 2 1 2 T a 11 T a 22 T b 11 T b 22 + 1 2 T a 12 + T a 21 T b 12 + T b 21 (-1)/(-(k^(0))^(2)+(k^(3))^(2))[(1)/(2)(T_(a)^(11)-T_(a)^(22))(T_(b)^(11)-T_(b)^(22))+(1)/(2)(T_(a)^(12)+T_(a)^(21))(T_(b)^(12)+T_(b)^(21))]\frac{-1}{-\left(k^{0}\right)^{2}+\left(k^{3}\right)^{2}}\left[\frac{1}{2}\left(T_{a}^{11}-T_{a}^{22}\right)\left(T_{b}^{11}-T_{b}^{22}\right)+\frac{1}{2}\left(T_{a}^{12}+T_{a}^{21}\right)\left(T_{b}^{12}+T_{b}^{21}\right)\right]1(k0)2+(k3)2[12(Ta11Ta22)(Tb11Tb22)+12(Ta12+Ta21)(Tb12+Tb21)]
We conclude that there are two graviton polarizations: ( T b 11 T b 22 ) T b 11 T b 22 (T_(b)^(11)-T_(b)^(22))\left(T_{b}^{11}-T_{b}^{22}\right)(Tb11Tb22) and ( T b 12 + T b 21 ) T b 12 + T b 21 (T_(b)^(12)+T_(b)^(21))\left(T_{b}^{12}+T_{b}^{21}\right)(Tb12+Tb21).
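The choice $\alpha=1/2$ can be checked numerically. The short Python sketch below is an illustration only: the function names and the sample component values are our own, and the common prefactor $-1/\left(-\left(k^{0}\right)^{2}+\left(k^{3}\right)^{2}\right)$ is dropped since it multiplies both sides. It compares the bracket of eqn 47.20, corrected by the trace term of eqn 47.21, with the two-term bracket of eqn 47.22.

```python
import random

def retarded_bracket(Ta, Tb, alpha):
    """Bracket of eqn 47.20 combined with the trace term of eqn 47.21.

    Ta and Tb are dicts holding the transverse components '11', '22', '12'
    of two symmetric tensors (so T21 = T12). The trace term enters with a
    relative minus sign because eqn 47.20 carries a prefactor -1/(...)
    while eqn 47.21 carries +alpha/(...).
    """
    spin_sum = Ta['11'] * Tb['11'] + Ta['22'] * Tb['22'] + 2 * Ta['12'] * Tb['12']
    trace_term = (Ta['11'] + Ta['22']) * (Tb['11'] + Tb['22'])
    return spin_sum - alpha * trace_term

def two_polarization_form(Ta, Tb):
    """Bracket of eqn 47.22: only the two polarization terms survive."""
    plus = 0.5 * (Ta['11'] - Ta['22']) * (Tb['11'] - Tb['22'])
    cross = 2 * Ta['12'] * Tb['12']
    return plus + cross

def random_source(rng):
    """Arbitrary (hypothetical) transverse source components."""
    return {key: rng.uniform(-1, 1) for key in ('11', '22', '12')}

rng = random.Random(0)
Ta, Tb = random_source(rng), random_source(rng)

# alpha = 1/2 reproduces eqn 47.22; any other value leaves a trace remnant.
diff_half = abs(retarded_bracket(Ta, Tb, 0.5) - two_polarization_form(Ta, Tb))
diff_other = abs(retarded_bracket(Ta, Tb, 0.3) - two_polarization_form(Ta, Tb))
print(diff_half, diff_other)
```

The first difference vanishes to rounding error, while the second does not, which is the sense in which $\alpha=1/2$ is singled out.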
The field required to generate these gravitons takes the form
\begin{equation*}
h^{\mu \nu}(x)=A^{\mu \nu} \mathrm{e}^{\mathrm{i} \boldsymbol{k} \cdot \boldsymbol{x}}. \tag{47.24}
\end{equation*}
We can identify from the previous example that
\begin{equation*}
A^{11}=\frac{1}{\sqrt{2}}, \quad-A^{22}=\frac{1}{\sqrt{2}}, \quad A^{12}=A^{21}=\frac{1}{\sqrt{2}}. \tag{47.25}
\end{equation*}
${}^{11}$ This is examined further in the exercises.
${}^{12}$ The problem here is related to renormalization. Calculations in perturbation theory in QFT often lead to infinities. Unlike the case in some other QFTs, the infinities in gravitational perturbation theory, encountered at higher orders of scattering, cannot be removed by the set of techniques, known as renormalization, that have proved successful in removing infinities in theories like quantum electrodynamics or quantum chromodynamics. For example, if we compute the amplitude for gravitons to scatter from gravitons at energy $E$, perturbation theory predicts the amplitude is given by a series of the form $G\left[1+G E^{2}+\left(G E^{2}\right)^{2}+\ldots\right]$, where $G$ is the gravitational constant. Once the energy scale $E$ reaches $G^{-\frac{1}{2}}$ this approach fails as the series diverges. (By analogy with the closely related Fermi theory of the weak interaction we might expect some new physics to appear at this energy scale.) The infinities encountered in this approach are avoided to an extent in string theory, which follows a similar route to QFT and is described in Chapter 49. See Zee's Quantum Field Theory in a Nutshell (2003) for a discussion of the analogy between graviton-graviton scattering and the Fermi theory.
How do we know the expressions identified in the last example are the correct polarizations for an $S=2$ field? For circularly polarized gravitons, we must be able to shift to a coordinate system where the phases behave as $\mathrm{e}^{2 \mathrm{i} \theta}$ and $\mathrm{e}^{-2 \mathrm{i} \theta}$. As can be checked, it is possible to rewrite the polarization part of the retarded term as
\begin{align*}
& \frac{1}{4}\left(T_{a}^{11}-T_{a}^{22}+2 \mathrm{i} T_{a}^{12}\right)\left(T_{b}^{11}-T_{b}^{22}-2 \mathrm{i} T_{b}^{12}\right) \\
& +\frac{1}{4}\left(T_{a}^{11}-T_{a}^{22}-2 \mathrm{i} T_{a}^{12}\right)\left(T_{b}^{11}-T_{b}^{22}+2 \mathrm{i} T_{b}^{12}\right). \tag{47.26}
\end{align*}
Each bracket has the form $(x x-y y \pm 2 \mathrm{i} x y)$, which is equivalent to $(x \pm \mathrm{i} y)(x \pm \mathrm{i} y)$. Since each of the bracketed terms in this last expression can be represented as a phase $\mathrm{e}^{\pm \mathrm{i} \theta}$, their product has the required $\mathrm{e}^{\pm 2 \mathrm{i} \theta}$ phase.${}^{11}$
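Both claims can be spot-checked numerically. The Python sketch below is an illustration under our own assumptions: the helper names and the sample tensors are hypothetical, and the rotation matrix is taken to be the one in eqn 47.31 restricted to the transverse $(1,2)$ block. It checks that eqn 47.26 agrees with the bracket of eqn 47.22, and that the combinations $T^{11}-T^{22} \pm 2 \mathrm{i} T^{12}$ pick up phases $\mathrm{e}^{\mp 2 \mathrm{i} \theta}$ under that rotation (the sign attached to each combination depends on the rotation convention).

```python
import cmath
import math

def rotate_tensor(T, theta):
    """Return R T R^T for a 2x2 transverse block, with R taken from eqn 47.31."""
    c, s = math.cos(theta), math.sin(theta)
    R = [[c, s], [-s, c]]
    return [[sum(R[i][m] * R[j][n] * T[m][n] for m in range(2) for n in range(2))
             for j in range(2)] for i in range(2)]

def circular_components(T):
    """The combinations T11 - T22 + 2i T12 and T11 - T22 - 2i T12."""
    base = T[0][0] - T[1][1]
    return base + 2j * T[0][1], base - 2j * T[0][1]

def linear_form(Ta, Tb):
    """Bracket of eqn 47.22."""
    return (0.5 * (Ta[0][0] - Ta[1][1]) * (Tb[0][0] - Tb[1][1])
            + 2 * Ta[0][1] * Tb[0][1])

def circular_form(Ta, Tb):
    """Eqn 47.26."""
    a_plus, a_minus = circular_components(Ta)
    b_plus, b_minus = circular_components(Tb)
    return 0.25 * (a_plus * b_minus + a_minus * b_plus)

# Arbitrary symmetric sample tensors and rotation angle (illustrative values).
Ta = [[0.7, -0.2], [-0.2, 0.1]]
Tb = [[-0.4, 0.9], [0.9, 0.3]]
theta = 0.37

identity_error = abs(circular_form(Ta, Tb) - linear_form(Ta, Tb))

plus, minus = circular_components(Tb)
plus_rot, minus_rot = circular_components(rotate_tensor(Tb, theta))
phase_error_plus = abs(plus_rot - cmath.exp(-2j * theta) * plus)
phase_error_minus = abs(minus_rot - cmath.exp(+2j * theta) * minus)
print(identity_error, phase_error_plus, phase_error_minus)
```

All three numbers should be zero to rounding error. The factor of 2 in the exponent is the signature of helicity $\pm 2$; the analogous combinations for a vector would rotate with $\mathrm{e}^{\pm \mathrm{i} \theta}$.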

Example 47.4

From the last example we see that the amplitude can be written as
\begin{equation*}
\mathcal{A}=\frac{-\left(T_{a}\right)_{\mu \nu} T_{b}^{\mu \nu}+\frac{1}{2}\left(T_{a}\right)^{\mu}{ }_{\mu}\left(T_{b}\right)^{\nu}{ }_{\nu}}{\boldsymbol{k}^{2}}. \tag{47.27}
\end{equation*}
We can then spot two things. The first is that the amplitude $\mathcal{A}$ can be rewritten in terms of the graviton propagator as $\mathcal{A}=T^{\prime \sigma \tau} D_{\sigma \tau \mu \nu} T^{\mu \nu}$, from which we find an expression for the propagator of
\begin{equation*}
D_{\sigma \tau \mu \nu}=-\frac{1}{2} \frac{\left(\eta_{\mu \sigma} \eta_{\nu \tau}+\eta_{\mu \tau} \eta_{\nu \sigma}-\eta_{\mu \nu} \eta_{\sigma \tau}\right)}{\boldsymbol{k}^{2}}. \tag{47.28}
\end{equation*}
The second is that, since the amplitude $\mathcal{A}$ is also proportional to $h_{\mu \nu} T_{a}^{\mu \nu}$, we conclude that the amplitude of gravitons emitted from a source can be written in terms of a field as
\begin{equation*}
h_{\mu \nu}(\boldsymbol{k}) \propto \frac{1}{\boldsymbol{k}^{2}}\left(T_{\mu \nu}-\frac{1}{2} \eta_{\mu \nu} T\right). \tag{47.29}
\end{equation*}
The part in brackets is, of course, familiar from the Einstein equation, so it is heartening to see it appear from this completely different approach. In fact, this equation is recognizable as the momentum-space version of the weak-field Einstein equation.
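The step from eqn 47.27 to the propagator of eqn 47.28 can be confirmed by brute-force index contraction. The Python sketch below is only an illustration: the function names are ours, the metric is assumed to be $\eta_{\mu \nu}=\operatorname{diag}(-1,1,1,1)$, the source tensors are random symmetric arrays, and $\boldsymbol{k}^{2}$ is represented by an arbitrary number `k2`.

```python
import random

# Minkowski metric, assumed signature (-, +, +, +).
ETA = [[0.0] * 4 for _ in range(4)]
for i, sign in enumerate((-1.0, 1.0, 1.0, 1.0)):
    ETA[i][i] = sign

def propagator(s, t, m, n, k2):
    """D_{sigma tau mu nu} of eqn 47.28."""
    return -0.5 * (ETA[m][s] * ETA[n][t] + ETA[m][t] * ETA[n][s]
                   - ETA[m][n] * ETA[s][t]) / k2

def amplitude_from_propagator(Ta, Tb, k2):
    """Contract T_a^{sigma tau} D_{sigma tau mu nu} T_b^{mu nu}."""
    idx = range(4)
    return sum(Ta[s][t] * propagator(s, t, m, n, k2) * Tb[m][n]
               for s in idx for t in idx for m in idx for n in idx)

def trace(T):
    """T^mu_mu = eta_{mu nu} T^{mu nu} for a diagonal metric."""
    return sum(ETA[m][m] * T[m][m] for m in range(4))

def amplitude_direct(Ta, Tb, k2):
    """Right-hand side of eqn 47.27, lowering indices with the diagonal metric."""
    contraction = sum(ETA[m][m] * ETA[n][n] * Ta[m][n] * Tb[m][n]
                      for m in range(4) for n in range(4))
    return (-contraction + 0.5 * trace(Ta) * trace(Tb)) / k2

def random_symmetric(rng):
    """A random symmetric 4x4 array standing in for T^{mu nu}."""
    T = [[0.0] * 4 for _ in range(4)]
    for m in range(4):
        for n in range(m, 4):
            T[m][n] = T[n][m] = rng.uniform(-1, 1)
    return T

rng = random.Random(1)
Ta, Tb = random_symmetric(rng), random_symmetric(rng)
k2 = 0.83  # arbitrary nonzero value standing in for k^2

propagator_error = abs(amplitude_from_propagator(Ta, Tb, k2)
                       - amplitude_direct(Ta, Tb, k2))
print(propagator_error)
```

The agreement relies on the symmetry of the source tensors: the first two terms of the propagator each contribute one copy of $\left(T_{a}\right)_{\mu \nu} T_{b}^{\mu \nu}$, and the third supplies the trace term.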
Although we might feel pleased with the progress made in this chapter, it's sobering to remember that nobody has yet quantized gravity consistently. The approach suggested here, which has perturbation theory at its root, has been attempted several times, but does not lead to a consistent theory.${}^{12}$ A successful quantum field theory of gravity might still be expected to result in a prediction of the graviton excitations with properties something like those that we have discussed here. (However, there is nothing to guarantee this.) In Chapter 49, we shall look at some possible avenues for this project of finding quantum gravity. In order to get there, we shall need to make some more room in spacetime by considering the possibility of spacetime with more than (3+1) dimensions, which is our next subject.

Chapter summary

  • The graviton is a force-carrying particle with spin $S=2$ and two polarizations.
  • The graviton has spin $S=2$ because its source is a second-rank tensor $T^{\mu \nu}$; the photon has spin $S=1$ because its source is a first-rank tensor $J^{\mu}$.
  • In scattering theory, the gravitational interaction between masses can be written in terms of an amplitude as
\begin{equation*}
\mathcal{A}=-T_{a}^{\alpha \beta}\left(\frac{\eta_{\alpha \mu} \eta_{\beta \nu}}{\boldsymbol{k}^{2}+\mathrm{i} \varepsilon}\right) T_{b}^{\mu \nu}. \tag{47.30}
\end{equation*}
  • The perturbative scattering approach fails to describe gravitation at higher orders of perturbation theory.

Exercises

(47.1) Verify that eqn 47.26 is equivalent to the retarded term in the graviton amplitude.
(47.2) The polarization vectors of a spin-1 particle change according to $\left(\boldsymbol{\epsilon}_{\mu}^{\prime}\right)^{i}=R^{i}{ }_{j}(\theta)\left(\boldsymbol{\epsilon}_{\mu}\right)^{j}$, where the rotation matrix is given by
\begin{equation*}
R(\theta)=\left(\begin{array}{ccc}
\cos \theta & \sin \theta & 0 \\
-\sin \theta & \cos \theta & 0 \\
0 & 0 & 1
\end{array}\right). \tag{47.31}
\end{equation*}
Find combinations of the linear polarization vectors that obey the transformation law
\begin{equation*}
R_{j}^{i}(\theta)\left(\boldsymbol{\epsilon}_{h}\right)^{j}=\mathrm{e}^{\mathrm{i} h \theta}\left(\boldsymbol{\epsilon}_{h}^{\prime}\right)^{i} \tag{47.32}
\end{equation*}
for helicities $h=1,0$ and $-1$.
For photons, only the $h=1$ and $h=-1$ helicities are found in Nature.
(47.3) For gravitons, polarizations are expressed as $(0,2)$ tensors $\boldsymbol{\epsilon}$ with components $\epsilon_{i j}$ that transform as
\begin{equation*}
\left(\epsilon_{\mu}^{\prime}\right)_{i j}=R_{i}^{m}(\theta) R_{j}^{n}(\theta)\left(\epsilon_{\mu}\right)_{m n}, \tag{47.33}
\end{equation*}
where the rotation matrix is the same as in the last question. (Note the slightly awkward ordering of the components here. We take $R^{i}{ }_{j}=R_{i}{ }^{j}$ in any case.) For a graviton travelling along $z$, we found in the previous chapter that the only nonzero components are $\epsilon_{11}=-\epsilon_{22}$ and $\epsilon_{12}=\epsilon_{21}$.
(a) Show that under the transformation we obtain
\begin{align*}
\left(\epsilon^{\prime}\right)_{11} &=\left(\cos ^{2} \theta-\sin ^{2} \theta\right) \epsilon_{11}+2 \sin \theta \cos \theta\, \epsilon_{12}, \\
\left(\epsilon^{\prime}\right)_{12} &=-2 \sin \theta \cos \theta\, \epsilon_{11}+\left(\cos ^{2} \theta-\sin ^{2} \theta\right) \epsilon_{12}. \tag{47.34}
\end{align*}
(b) Find linear combinations of the polarization tensors that yield the two polarization states of the graviton with $h=\pm 2$.
(47.4) (a) Show that the gravitational wave power $L$ from a binary system of two masses, $m_{1}$ and $m_{2}$, separated by distance $a$, and in a circular orbit about their centre of mass, is given by
\begin{equation*}
L=\frac{32 G^{4} m_{1}^{2} m_{2}^{2}\left(m_{1}+m_{2}\right)}{5 c^{5} a^{5}}. \tag{47.35}
\end{equation*}
Estimate this quantity for the Earth-Sun system. Also, estimate the number of gravitons emitted per second.
(b) Estimate how long it would take you to emit a single graviton by frantically waving your arms around in the air.

48
Higher dimensional spacetime

48.1 Gauge transformations in five dimensions
48.2 Unifying electromagnetism and gravitation
Chapter summary
Exercises
${}^{1}$ Theodor Kaluza (1885-1954). It is said that he taught himself to swim by reading a book, resulting in him successfully swimming on his first attempt.
${}^{2}$ We follow the approach of Zee in this chapter. Kaluza's theory was rediscovered and developed by Oskar Klein (1894-1977) in 1926, who provided a quantum mechanical description of the theory. Gunnar Nordström had also independently developed a related theory before Kaluza.
${}^{3}$ Max Planck (1858-1947) was an early champion of special relativity, extending the theory by formulating the relativistic action. Planck and Einstein were close friends who would meet to play music together. In his biography of Einstein, Abraham Pais notes Einstein's profound respect for Planck, both as a scientist and as a deeply principled human being.
${}^{4}$ Remember the conceptual equation $(\overline{\partial g})+\left(\overline{\partial^{2} g}\right) \rightarrow R$.
The idea that this can be achieved through a five dimensional cylinder-world has never occurred to me and would seem to be altogether new. I like your idea at first sight very much. Albert Einstein, letter to Theodor Kaluza (1919)
General relativity presents us with a classical field theory of gravitation expressed using the tools of geometry. In Chapter 42, we met the classical field theory of electromagnetism expressed in similar geometric language. It's natural to ask whether gravitation and electromagnetism can be combined in such a way that they naturally arise as different facets of some master theory. This is the project of unification which, in a broader, modern sense, involves a combination of gravity and the standard model of particle physics. In this chapter, we examine an attempt, originally made by Theodor Kaluza${}^{1}$ in 1919, to use the gauge structure of electromagnetism and gravity to combine these interactions. The solution, known as Kaluza-Klein theory,${}^{2}$ involves adding an extra spatial dimension to spacetime.
Since, in this chapter, we shall be comparing theories in different numbers of dimensions, it will be helpful to make the action of our gravitation theory dimensionless. We do this by employing some dimensional analysis. We use units where $c=1$ and also where Planck's constant${}^{3}$ $\hbar=1$. In such units, a mass has units of $1/(\text{length})$ or $1/L$. The Einstein-Hilbert action was previously written as
\begin{equation*}
S_{\mathrm{EH}}=\int \mathrm{d}^{4} x \sqrt{-g}\, R(\boldsymbol{g}), \tag{48.1}
\end{equation*}
where we write the Ricci scalar as $R(\boldsymbol{g})$ to remind us that it is derived from the first and second derivatives of the components of the metric tensor.${}^{4}$ The components of the metric tensor $\boldsymbol{g}$ are dimensionless. The Ricci scalar $R$ involves two derivatives of the metric and therefore carries units of $1/L^{2}$. As a result, $S_{\mathrm{EH}}$, with its four contributions of length from $\mathrm{d}^{4} x$, has units $L^{2}$. In order to make it dimensionless, we multiply by two powers of mass $m_{\mathrm{P}}$ with the result that
\begin{equation*}
S_{\mathrm{EH}}=\int \mathrm{d}^{4} x \sqrt{-g}\, m_{\mathrm{P}}^{2} R(\boldsymbol{g}). \tag{48.2}
\end{equation*}
The mass we choose sets the scale of gravitational interactions and is known as the Planck mass. We shall discuss this quantity further in Chapter 49.

48.1 Gauge transformations in five dimensions

Kaluza's scheme for unifying electromagnetism and gravity can be understood by (once again) comparing the structure of the gauge transformations in electromagnetism and in gravitation. We saw in Chapter 42 that the gauge transformation in electromagnetism${}^{5}$ can be written in terms of the components of the electromagnetic 1-form $\tilde{\boldsymbol{A}}$ as
\begin{equation*}
A_{\mu} \rightarrow A_{\mu}-\chi_{, \mu}. \tag{48.3}
\end{equation*}
This transformation has no effect on the Faraday 2-form $\tilde{\boldsymbol{F}}=\boldsymbol{d} \tilde{\boldsymbol{A}}$ and hence on the underlying equations of motion of the electromagnetic fields.
From the point of view of Chapter 44, the field $\tilde{\boldsymbol{A}}$ was mandated by changes made in the internal phase variable $\theta(\mathcal{P}) \rightarrow \theta(\mathcal{P})+\alpha(\mathcal{P})$, where $\mathcal{P}$ labels a point in space. This variable has its information stored in a bundle of fibres, one at each point $\mathcal{P}$, floating above the spacetime. In the weak-field limit of gravitation, we recall that the gauge structure is derived by considering invariance under a set of infinitesimal coordinate transformations $x^{\mu}(\mathcal{P}) \rightarrow x^{\mu}(\mathcal{P})+\xi^{\mu}(\mathcal{P})$, which causes the components of the tensor $\boldsymbol{h}$ to change according to
\begin{equation*}
h_{\mu \nu} \rightarrow h_{\mu \nu}-\xi_{\mu, \nu}-\xi_{\nu, \mu}. \tag{48.4}
\end{equation*}
The key to unification is to treat the internal variable θ θ theta\thetaθ as describing a coordinate in spacetime. That is, we regard a coordinate that was previously stored in a fibre as now describing a position in spacetime. This allows us to combine the electromagnetic and gravitational gauge transformations in a manner where they all derive from a single set of infinitesimal coordinate transformations, and reveals the structure needed to unify the two interactions.
Kaluza's inspired idea was therefore to add to our four-dimensional coordinates $x^{\mu}=\left(x^{0}, x^{1}, x^{2}, x^{3}\right)$ a fifth coordinate $x^{5}$ to form the five-dimensional coordinate set${}^{6}$ $X^{a}=\left(x^{0}, x^{1}, x^{2}, x^{3}, x^{5}\right)$. We now demand invariance under the infinitesimal coordinate transformation
\begin{equation*}
X^{a}(\mathcal{P}) \rightarrow X^{a}(\mathcal{P})+\xi^{a}(\mathcal{P}). \tag{48.5}
\end{equation*}
In the new $(4+1)$-dimensional spacetime, we have a five-dimensional weak-field metric $\boldsymbol{H}$ with components
\begin{equation*}
H_{a b}=\eta_{a b}+h_{a b}, \tag{48.6}
\end{equation*}
where the five-dimensional version of the Minkowski metric has components $\eta_{a b}=\operatorname{diag}(-1,1,1,1,1)$. We have again the gauge transformation property of the metric components that $h_{a b} \rightarrow h_{a b}-\xi_{a, b}-\xi_{b, a}$.
We now use the fifth dimension to accommodate the electromagnetic gauge freedom. To see this, set the index $b=5$ to obtain
\begin{equation*}
h_{\mu 5} \rightarrow h_{\mu 5}-\xi_{\mu, 5}-\xi_{5, \mu}. \tag{48.7}
\end{equation*}
${}^{5}$ Remember that this arose from demanding local phase invariance for the matter fields that fill spacetime.
${}^{6}$ In what follows we always let $\mu$ run over values $0$ to $3$ and we let $a=0,1,2,3$ and $5$. The reason for the introduction of $x^{5}$, rather than the more logical $x^{4}$, is historical: it was conventional to call the timelike component $x^{4}$ in the older literature, instead of the more modern choice of $x^{0}$.
${}^{7}$ Remembering that Greek indices like $\mu$ run over $0$ to $3$ only.
${}^{8}$ In these matrix equations, the usual $(3+1)$ dimensions described by Greek indices live in the top left block $M_{\mu \nu}$, with the new, fifth dimension in the bottom right component $M_{55}$. The off-diagonal components $M_{\mu 5}$ and $M_{5 \mu}$ mix $(3+1)$ dimensions and the fifth dimension.
We then (i) set $h_{\mu 5}=\ell A_{\mu}$, where $\ell$ is an arbitrary length; and (ii) assume $\xi_{\mu}$ is independent of $x^{5}$. The result is that eqn 48.7 becomes
\begin{equation*}
\ell A_{\mu} \rightarrow \ell A_{\mu}-\xi_{5, \mu}, \tag{48.8}
\end{equation*}
which is identical to the electromagnetic gauge transformation in eqn 48.3 if we set $\chi=\xi_{5} / \ell$. In fact, if we take${}^{7}$ $\xi_{\mu}=0$ then the electromagnetic gauge transformation becomes
\begin{equation*}
x^{\mu} \rightarrow x^{\mu}, \quad x^{5} \rightarrow x^{5}+\ell \chi\left(x^{\mu}\right). \tag{48.9}
\end{equation*}
In summary, the electromagnetic gauge transformation has been absorbed into the infinitesimal coordinate transformation. Specifically, we recall that the 1-form $\tilde{\boldsymbol{A}}=A_{\mu} \boldsymbol{d} x^{\mu}$ transforms according to
\begin{equation*}
\tilde{\boldsymbol{A}} \rightarrow \tilde{\boldsymbol{A}}-\boldsymbol{d} \chi. \tag{48.10}
\end{equation*}
Since the coordinate $x^{5}$ transforms according to $x^{5} \rightarrow x^{5}+\ell \chi$, we have $\boldsymbol{d} x^{5} \rightarrow \boldsymbol{d} x^{5}+\ell \boldsymbol{d} \chi$, and we can spot that the combination $\left(\boldsymbol{d} x^{5}+\ell \tilde{\boldsymbol{A}}\right)$ is gauge invariant. This quantity then, linking a spatial coordinate and the electromagnetic field, is key to unification.
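The cancellation behind this gauge invariance can be seen in a small numerical sketch. Everything specific in the Python below is an assumption made for illustration: the gauge function `chi`, the potential components `A`, the length `ell`, and the displacement are arbitrary sample choices, not quantities taken from the text.

```python
def chi(x):
    """A hypothetical gauge function chi(x^mu), chosen only for illustration."""
    t, x1, x2, x3 = x
    return 0.3 * t * x1 + 0.5 * x2 ** 2 - 0.2 * x3

def grad_chi(x):
    """Analytic partial derivatives chi_{,mu} of the sample function above."""
    t, x1, x2, x3 = x
    return [0.3 * x1, 0.3 * t, 1.0 * x2, -0.2]

def invariant_combination(dz, A, dx, ell):
    """Evaluate dx^5 + ell * A_mu dx^mu on a small displacement (dx, dz)."""
    return dz + ell * sum(a * d for a, d in zip(A, dx))

ell = 0.7                          # arbitrary length scale
x = [0.1, -0.4, 0.9, 0.25]         # point at which the fields are evaluated
dx = [1e-3, -2e-3, 5e-4, 1.5e-3]   # displacement in the four usual coordinates
dz = 3e-3                          # displacement along x^5
A = [0.2, -0.1, 0.05, 0.4]         # sample components A_mu at x

dchi = grad_chi(x)
A_new = [a - g for a, g in zip(A, dchi)]                   # eqn 48.3
dz_new = dz + ell * sum(g * d for g, d in zip(dchi, dx))   # dx^5 -> dx^5 + ell d(chi)

before = invariant_combination(dz, A, dx, ell)
after = invariant_combination(dz_new, A_new, dx, ell)
print(before, after)
```

The shift of $\boldsymbol{d} x^{5}$ by $+\ell \boldsymbol{d} \chi$ is exactly undone by the shift of $\ell \tilde{\boldsymbol{A}}$ by $-\ell \boldsymbol{d} \chi$, so the two printed values agree to rounding error for any choice of `chi`.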
From the point of view of the gauge structure, the extra dimension can be used to bring the electromagnetic gauge transformation down from the fibre bundle and into the heart of spacetime itself. Motivated by this, we shall see in the next section how the extra structure can be used to build a metric that incorporates electromagnetism and gravitation.

48.2 Unifying electromagnetism and gravitation

To simplify our notation a little, let's call the $x^{5}$ coordinate $z$. We write the action for the enlarged spacetime as
\begin{equation*}
S=\int \mathrm{d}^{4} x\, \mathrm{d} z \sqrt{-H}\, m_{\mathrm{K}}^{3} R(\boldsymbol{H}), \tag{48.11}
\end{equation*}
where an extra mass factor $m_{\mathrm{K}}$ has been included since there are now 5 powers of length in the terms $\mathrm{d}^{4} x\, \mathrm{d} z$. The mass $m_{\mathrm{K}}$ sets the scale for five-dimensional gravity, just as the Planck mass $m_{\mathrm{P}}$ did in four dimensions. The form of the metric is strongly constrained if we stipulate that it must be gauge invariant. Since the gauge transformation changes $z \rightarrow z+\ell \chi$ there is only one way a gauge invariant metric line element can be constructed. The line element must be
\begin{equation*}
\mathrm{d} s^{2}=g_{\mu \nu} \mathrm{d} x^{\mu} \mathrm{d} x^{\nu}+\left(\mathrm{d} z+\ell A_{\mu} \mathrm{d} x^{\mu}\right)^{2}. \tag{48.12}
\end{equation*}
Expanding the bracket we can read off the components of the metric, which takes the form of a matrix${}^{8}$
\begin{equation*}
H_{a b}=\left(\begin{array}{cc}
g_{\mu \nu}+\ell^{2} A_{\mu} A_{\nu} & \ell A_{\mu} \\
\ell A_{\nu} & 1
\end{array}\right). \tag{48.13}
\end{equation*}
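As a consistency check on eqn 48.13, the Python sketch below builds the $5 \times 5$ matrix from a sample four-dimensional metric and potential and confirms that it reproduces the line element of eqn 48.12. It also checks that $\operatorname{det} H_{a b}=\operatorname{det} g_{\mu \nu}$, which follows from the block structure of the matrix. The numerical values of `g`, `A`, `ell` and the displacement are arbitrary illustrative assumptions, and the helper names are ours.

```python
def kaluza_klein_metric(g, A, ell):
    """Assemble the 5x5 matrix of eqn 48.13 from g_{mu nu}, A_mu and ell."""
    n = len(g)
    H = [[g[m][v] + ell ** 2 * A[m] * A[v] for v in range(n)] + [ell * A[m]]
         for m in range(n)]
    H.append([ell * A[v] for v in range(n)] + [1.0])
    return H

def interval_direct(g, A, ell, dx, dz):
    """Line element of eqn 48.12 evaluated on a displacement (dx, dz)."""
    n = len(g)
    four_d = sum(g[m][v] * dx[m] * dx[v] for m in range(n) for v in range(n))
    return four_d + (dz + ell * sum(A[m] * dx[m] for m in range(n))) ** 2

def interval_from_matrix(H, dX):
    """H_{ab} dX^a dX^b for a five-component displacement dX."""
    n = len(H)
    return sum(H[a][b] * dX[a] * dX[b] for a in range(n) for b in range(n))

def determinant(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    work = [row[:] for row in M]
    n = len(work)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(work[r][col]))
        if work[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            work[col], work[pivot] = work[pivot], work[col]
            det = -det
        det *= work[col][col]
        for r in range(col + 1, n):
            factor = work[r][col] / work[col][col]
            for c in range(col, n):
                work[r][c] -= factor * work[col][c]
    return det

# Sample data: a slightly perturbed Minkowski metric and an arbitrary potential.
g = [[-1.00, 0.02, 0.00, 0.01],
     [0.02, 1.05, 0.03, 0.00],
     [0.00, 0.03, 0.98, 0.04],
     [0.01, 0.00, 0.04, 1.01]]
A = [0.3, -0.1, 0.2, 0.05]
ell = 0.8
dx = [0.5, -0.2, 0.1, 0.3]
dz = 0.4

H = kaluza_klein_metric(g, A, ell)
interval_error = abs(interval_direct(g, A, ell, dx, dz)
                     - interval_from_matrix(H, dx + [dz]))
det_error = abs(determinant(H) - determinant(g))
print(interval_error, det_error)
```

The equality of determinants means that $\sqrt{-H}=\sqrt{-g}$ for this form of the metric, whatever the value of $A_{\mu}$.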
Example 48.1
The (4+1) metric can be used to generate a Ricci scalar $R(\boldsymbol{H})$, with the result that${}^{9}$
\begin{equation*}
R(\boldsymbol{H})=R(\boldsymbol{g})-\frac{1}{4} F^{\mu \nu} F_{\mu \nu}. \tag{48.14}
\end{equation*}
This gives us contributions to the action from gravitation and from electromagnetism,
\begin{equation*}
S=\int \mathrm{d}^{4} x\, \mathrm{d} z \sqrt{-H}\left(R(\boldsymbol{g})-\frac{1}{4} F^{\mu \nu} F_{\mu \nu}\right), \tag{48.15}
\end{equation*}
where $H=\operatorname{det} H_{a b}$ and with the electromagnetic part having the Lagrangian $\mathcal{L}=-\frac{1}{4} F^{\mu \nu} F_{\mu \nu}$ that we met in Chapter 42.
We have seen that for the cost of adding an extra dimension to space, gravitation and electromagnetism can be combined. But if this is a description of reality, where is this extra dimension and, since we don't appear to have detected it in our measurements, how can it be explored?
Taking our lead from the structure of our fibres in Chapter 44, we propose that the extra dimension is hidden from us by virtue of its being wound, or compactified, into a very small circle of radius $a$. As a result, $z \equiv x^{5}$ varies in the range $0 \leq x^{5} \leq 2 \pi a$. This curvature of space is permitted by general relativity and, since in our units energy has units $1/L$, we see that potentially enormous energies would be needed to experimentally resolve the dimension if it is small enough.${}^{10}$ Put another way, in order to escape into this dimension, a particle would need (a huge) momentum of order $p \approx 1/a$. We therefore have the picture of spacetime in Fig. 48.1. It has, at each point $x^{\mu}$ in its (3+1)-dimensional subspace, a dimension resembling a tiny, circular knob. The electromagnetic gauge transformation $x^{5} \rightarrow x^{5}+\ell \chi\left(x^{\mu}\right)$ corresponds to a rotation of the knobs by different amounts at each point.
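To get a feel for the energies involved, we can restore the factors of $\hbar$ and $c$ and estimate $E \approx \hbar c / a$. The Python sketch below is a rough order-of-magnitude illustration only; the sample radii are our own choices (the last is approximately the Planck length), not values proposed in the text.

```python
# hbar * c = 197.3269804 MeV fm, converted to eV m.
HBAR_C_EV_M = 1.973269804e-7

def resolving_energy_eV(radius_m):
    """Rough energy scale E ~ hbar c / a needed to probe a circle of radius a."""
    return HBAR_C_EV_M / radius_m

# Illustrative radii in metres; the last is roughly the Planck length.
for radius in (1e-19, 1e-30, 1.616e-35):
    energy = resolving_energy_eV(radius)
    print(f"a = {radius:.3e} m  ->  E ~ {energy:.3e} eV")
```

A radius of $10^{-19}\,\mathrm{m}$ already corresponds to an energy of order a TeV, and a Planck-sized circle to an energy of order $10^{28}\,\mathrm{eV}$, far beyond any accelerator.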

Example 48.2

Using this idea we can link the interaction scale $m_{\mathrm{K}}$ to the Planck mass $m_{\mathrm{P}}$. In the absence of an electromagnetic field, we have $H_{\mu \nu}=g_{\mu \nu}$, $H_{\mu 5}=0$ and $H_{55}=1$. The action then becomes
\begin{equation*} S=2 \pi a m_{\mathrm{K}}^{3} \int \mathrm{d}^{4} x \sqrt{-g}\, R(\boldsymbol{g}) \end{equation*}
and so we recognize from eqn 48.2 that $m_{\mathrm{P}}^{2}=2 \pi a m_{\mathrm{K}}^{3}$.
We can use the Kaluza-Klein metric $\boldsymbol{H}$ to work out the equations of motion for a particle in flat (3+1)-dimensional spacetime.

Example 48.3

We start from the action for a particle,${}^{11}$ including the coordinate $z$, written as
\begin{equation*} S=-m \int\left[-\eta_{\mu \nu} \mathrm{d} x^{\mu} \mathrm{d} x^{\nu}+\left(\mathrm{d} z+\ell A_{\mu} \mathrm{d} x^{\mu}\right)^{2}\right]^{\frac{1}{2}} . \tag{48.17} \end{equation*}
${}^{9}$ See the exercises and also the book by Zee (2013) for the details of how this is done.
${}^{10}$ This theme of hidden, compactified dimensions is one we pick up in Chapter 49.
Fig. 48.1 Spacetime in the Kaluza-Klein theory. At every point in three-dimensional space an extra dimension can be found, wound up into a small circle of radius $a$.
${}^{11}$ This is simply an extension of the usual action for a particle in an electromagnetic field of $S=-m \int \sqrt{-\eta_{\mu \nu} \mathrm{d} x^{\mu} \mathrm{d} x^{\nu}}+q \int A_{\sigma} \mathrm{d} x^{\sigma}$ from Chapter 42.
${}^{12}$ These are $F_{\mu \nu}=A_{\nu, \mu}-A_{\mu, \nu}$.
${}^{13}$ In the last chapter, for example, we predicted with quantum field theory (QFT) that the amplitude for graviton-graviton scattering varies with energy $E$ as $G\left(1+G E^{2}+\ldots\right)$, where $G$ is the gravitational constant. Effective field theory says that such an approach is permissible as long as we confine ourselves to the realm of applicability of the theory. In this case, this is $E \ll G^{-\frac{1}{2}}$. As discussed in the next chapter, this limiting energy scale is that of the Planck mass $m_{\mathrm{P}}$.
${}^{14}$ Effective theories are described in more detail in Zee. This point of view has been influential in Condensed Matter and in Particle Physics, where it follows from analysis using renormalization group (RG) techniques. See our Quantum Field Theory for the Gifted Amateur (2014) for a description of RG.
Using the Euler-Lagrange equations, we end up with two equations of motion. The first says that the momentum $p^{z}$ in the $z$-direction is constant and given by
\begin{equation*} p^{z}=m\left(\frac{\mathrm{d} z}{\mathrm{d} \tau}+\ell A_{\mu} \frac{\mathrm{d} x^{\mu}}{\mathrm{d} \tau}\right) . \tag{48.18} \end{equation*}
The second equation of motion is
\begin{equation*} \frac{\mathrm{d}}{\mathrm{d} \tau}\left(-\eta_{\mu \nu} m \frac{\mathrm{d} x^{\mu}}{\mathrm{d} \tau}+\left(p^{z} \ell\right) A_{\nu}\right)=\left(p^{z} \ell\right) \frac{\partial A_{\lambda}}{\partial x^{\nu}} \frac{\mathrm{d} x^{\lambda}}{\mathrm{d} \tau}, \tag{48.19} \end{equation*}
which becomes, on collecting the components of the Faraday tensor${}^{12}$ $\boldsymbol{F}$ and doing some simplifying,
\begin{equation*} m \frac{\mathrm{d}^{2} x^{\mu}}{\mathrm{d} \tau^{2}}=\left(p^{z} \ell\right) F^{\mu}{}_{\nu} u^{\nu}, \tag{48.20} \end{equation*}
where $u^{\nu}$ are components of the particle's velocity $\boldsymbol{u}$. Comparing with what we had in Chapter 42, we see that the electromagnetic charge is given by $q=p^{z} \ell$. In words: the momentum along the $z$-direction tells us $q$, the strength of the interaction between the particle and the electromagnetic field $\tilde{\boldsymbol{A}}$. Finally, the wavefunction for a particle confined to a circle has the form $\psi(z) \propto \mathrm{e}^{\mathrm{i} p^{z} z}$ and, in order for the wavefunction to be single valued, we require a quantized $\left(p^{z}\right)_{n}=2 \pi n / 2 \pi a=n / a$, where $n$ is an integer. This implies that electric charge in this picture must be quantized in units of
\begin{equation*} q=\frac{\ell}{a}, \tag{48.21} \end{equation*}
that is, the ratio of the length $\ell$ to the radius $a$ of the extra dimension.
The unification that Kaluza-Klein theory achieves is very interesting, but ultimately we still lack a quantum theory of gravitation that combines gravity and the standard model of particle physics. The search for one is the subject of our next chapter. Before closing this chapter, we can use the tools we have developed to address a different extension of general relativity: what if, instead of extra dimensions, there are extra interactions?

Example 48.4

The field-theory approach allows us to treat general relativity as an effective theory. The idea here is that our observations are made at low energies and long length scales (relative to some very small length scale $\ell$, for example).${}^{13}$ As a result, the terms in the gravitational action that determine the equations of motion that we can probe in our observations are those where the fields (such as the metric) vary most slowly. It might be that there are really higher order terms in the relativistic action where the fields vary more rapidly, while the ones we have identified represent an effective, low-energy approximation to a more complete theory of gravitation.${}^{14}$ The higher order terms are those scalars that involve more derivatives of the metric, so will combine more multiples of the objects formed from the components of $\boldsymbol{R}$. We can use these to upgrade our Einstein-Hilbert action from $S_{\mathrm{EH}}=m_{\mathrm{P}}^{2} \int \mathrm{d}^{4} x \sqrt{-g}\, R$ to
\begin{equation*} S_{\mathrm{EH}}^{\prime}=m_{\mathrm{P}}^{2} \int \mathrm{d}^{4} x \sqrt{-g}\left[R+\ell^{2}\left(\alpha R^{2}+\beta R_{\mu \nu} R^{\mu \nu}+\gamma R_{\mu \nu \sigma \rho} R^{\mu \nu \sigma \rho}\right)+\ldots\right], \tag{48.22} \end{equation*}
where $\ell$ is a length and $\alpha$, $\beta$ and $\gamma$ are constants. The introduction of the length $\ell$ is required on dimensional grounds, since the terms in the brackets involve two more derivatives of the metric field than $R$ does. This length allows us to fix the scale at which these extra interactions are important, just as an analogous quantity allowed us to pick out the scale at which extra dimensions are important earlier in the chapter.
Fixing a probable value of $\ell$ will occupy us in the next chapter.

Chapter summary

  • Kaluza-Klein theory combines electromagnetism and gravitation via their gauge structure by introducing an extra spatial dimension described by a coordinate $x^{5}$.
  • A metric for the resulting $(4+1)$-dimensional spacetime incorporates the metric for $(3+1)$-dimensional spacetime along with the electromagnetic field $\tilde{\boldsymbol{A}}(x)$.
  • The extra dimension is wound into a tiny circle. Escaping into this dimension costs enormous amounts of energy.

Exercises

(48.1) We shall use the metric in eqn 48.13 to show eqn 48.14, using the method from Chapter 36. We follow the steps in the textbook by Zee, which should be consulted for further discussion.
We absorb the factor of $\ell$ into $A_{\mu}$ so that the metric is
\begin{equation*} \boldsymbol{d} \boldsymbol{s}^{2}=g_{\mu \nu} \boldsymbol{\omega}^{\mu} \otimes \boldsymbol{\omega}^{\nu}+\left(\boldsymbol{d} z+A_{\mu} \boldsymbol{d} x^{\mu}\right)^{2}, \tag{48.23} \end{equation*}
where Greek indices, as usual, range from 0 to 3. We have, in the orthonormal frame,
\begin{equation*} \boldsymbol{d} \boldsymbol{s}^{2}=\eta_{\hat{\alpha} \hat{\beta}} \boldsymbol{\omega}^{\hat{\alpha}} \otimes \boldsymbol{\omega}^{\hat{\beta}}+\boldsymbol{\omega}^{\hat{5}} \otimes \boldsymbol{\omega}^{\hat{5}} \tag{48.24} \end{equation*}
and
\begin{equation*} \boldsymbol{\omega}^{\hat{5}}=\left(\boldsymbol{d} z+A_{\hat{\alpha}} \boldsymbol{d} x^{\hat{\alpha}}\right) . \tag{48.25} \end{equation*}
(a) Evaluate $\boldsymbol{d} \boldsymbol{\omega}^{\hat{5}}$ to show (using Idea 1 from Chapter 36) that
\begin{equation*} \boldsymbol{\omega}^{\hat{5}}{}_{\hat{\alpha}}=\frac{1}{2} F_{\hat{\alpha} \hat{\beta}} \boldsymbol{\omega}^{\hat{\beta}} . \tag{48.26} \end{equation*}
(b) Show further that
\begin{equation*} \boldsymbol{\omega}^{\hat{\alpha}}{}_{\hat{\beta}}=\boldsymbol{\Omega}^{\hat{\alpha}}{}_{\hat{\beta}}-\frac{1}{2} F^{\hat{\alpha}}{}_{\hat{\beta}} \boldsymbol{\omega}^{\hat{5}}, \tag{48.27} \end{equation*}
where $\boldsymbol{\Omega}^{\hat{\alpha}}{}_{\hat{\beta}}$ are the connection 1-forms for the usual four-dimensional metric.
(c) By using Idea 2 from Chapter 36, verify that
\begin{align*} \mathcal{R}^{\hat{\alpha}}{}_{\hat{\beta}}= {} & \boldsymbol{d} \boldsymbol{\Omega}^{\hat{\alpha}}{}_{\hat{\beta}}-\frac{1}{2} F^{\hat{\alpha}}{}_{\hat{\beta}, \hat{\gamma}} \boldsymbol{\omega}^{\hat{\gamma}} \wedge \boldsymbol{\omega}^{\hat{5}}-\frac{1}{2} F^{\hat{\alpha}}{}_{\hat{\beta}} \boldsymbol{d} \boldsymbol{\omega}^{\hat{5}} \\ & +\left(\boldsymbol{\Omega}^{\hat{\alpha}}{}_{\hat{\gamma}}-\frac{1}{2} F^{\hat{\alpha}}{}_{\hat{\gamma}} \boldsymbol{\omega}^{\hat{5}}\right) \wedge\left(\boldsymbol{\Omega}^{\hat{\gamma}}{}_{\hat{\beta}}-\frac{1}{2} F^{\hat{\gamma}}{}_{\hat{\beta}} \boldsymbol{\omega}^{\hat{5}}\right) \\ & -\left(\frac{1}{2} F^{\hat{\alpha}}{}_{\hat{\gamma}} \boldsymbol{\omega}^{\hat{\gamma}}\right) \wedge\left(\frac{1}{2} F_{\hat{\beta} \hat{\delta}} \boldsymbol{\omega}^{\hat{\delta}}\right) . \tag{48.28} \end{align*}
(d) Since we are only trying to compute the Ricci scalar, terms containing $\boldsymbol{\omega}^{\hat{\gamma}} \wedge \boldsymbol{\omega}^{\hat{5}}$ do not contribute. Use this fact to express the useful part of the curvature 2-form as
\begin{align*} \mathcal{R}^{\hat{\alpha}}{}_{\hat{\beta}}= {} & \tilde{\mathcal{R}}^{\hat{\alpha}}{}_{\hat{\beta}}-\frac{1}{4} F^{\hat{\alpha}}{}_{\hat{\beta}} F_{\hat{\gamma} \hat{\delta}} \boldsymbol{\omega}^{\hat{\gamma}} \wedge \boldsymbol{\omega}^{\hat{\delta}} \\ & -\frac{1}{8}\left(F^{\hat{\alpha}}{}_{\hat{\gamma}} F_{\hat{\beta} \hat{\delta}}-F^{\hat{\alpha}}{}_{\hat{\delta}} F_{\hat{\beta} \hat{\gamma}}\right) \boldsymbol{\omega}^{\hat{\gamma}} \wedge \boldsymbol{\omega}^{\hat{\delta}}+\ldots \end{align*}
where $\tilde{\mathcal{R}}^{\hat{\alpha}}{}_{\hat{\beta}}$ is the four-dimensional part.
(e) Use this to show that the components of the Riemann tensor are given by
\begin{align*} R^{\hat{\alpha}}{}_{\hat{\beta} \hat{\gamma} \hat{\delta}}= {} & \tilde{R}^{\hat{\alpha}}{}_{\hat{\beta} \hat{\gamma} \hat{\delta}}-\frac{1}{2} F^{\hat{\alpha}}{}_{\hat{\beta}} F_{\hat{\gamma} \hat{\delta}} \\ & -\frac{1}{4}\left(F^{\hat{\alpha}}{}_{\hat{\gamma}} F_{\hat{\beta} \hat{\delta}}-F^{\hat{\alpha}}{}_{\hat{\delta}} F_{\hat{\beta} \hat{\gamma}}\right), \tag{48.30} \end{align*}
where $\tilde{R}^{\hat{\alpha}}{}_{\hat{\beta} \hat{\gamma} \hat{\delta}}$ is again the four-dimensional part.
(f) Turning now to the $\hat{5}$-components, verify that
\begin{align*} \mathcal{R}^{\hat{5}}{}_{\hat{\alpha}}= {} & \frac{1}{2} F_{\hat{\alpha} \hat{\beta}, \hat{\gamma}} \boldsymbol{\omega}^{\hat{\gamma}} \wedge \boldsymbol{\omega}^{\hat{\beta}}-\frac{1}{2} F_{\hat{\alpha} \hat{\beta}} \boldsymbol{\Omega}^{\hat{\beta}}{}_{\hat{\gamma}} \wedge \boldsymbol{\omega}^{\hat{\gamma}} \\ & -\frac{1}{2} F_{\hat{\beta} \hat{\gamma}} \boldsymbol{\Omega}^{\hat{\beta}}{}_{\hat{\alpha}} \wedge \boldsymbol{\omega}^{\hat{\gamma}} \\ & -\frac{1}{4} F_{\hat{\beta} \hat{\gamma}} F^{\hat{\beta}}{}_{\hat{\alpha}} \boldsymbol{\omega}^{\hat{\gamma}} \wedge \boldsymbol{\omega}^{\hat{5}} . \tag{48.31} \end{align*}
(g) Use the previous results to compute
\begin{equation*} R^{\hat{5}}{}_{\hat{\alpha} \hat{5} \hat{\gamma}}=\frac{1}{4} F_{\hat{\beta} \hat{\gamma}} F^{\hat{\beta}}{}_{\hat{\alpha}} . \tag{48.32} \end{equation*}
(h) Now compute the components of the Ricci tensor to find
\begin{align*} & R_{\hat{\beta} \hat{\delta}}=\tilde{R}_{\hat{\beta} \hat{\delta}}-\frac{1}{2} F^{\hat{\alpha}}{}_{\hat{\beta}} F_{\hat{\alpha} \hat{\delta}}, \\ & R_{\hat{5} \hat{5}}=\frac{1}{4} F_{\hat{\gamma} \hat{\alpha}} F^{\hat{\gamma} \hat{\alpha}} . \tag{48.33} \end{align*}
(i) Finally, put these previous two equations together to show
\begin{equation*} R=\tilde{R}-\frac{1}{4} F^{\mu \nu} F_{\mu \nu} . \tag{48.34} \end{equation*}

From classical to quantum gravity

What are the stars? ... They are bits of fire a few kilometres away. We could reach them if we wanted to. Or we could blot them out. The earth is the centre of the universe. The sun and the stars go round it.
George Orwell (1903-1950) Nineteen Eighty-Four

The fundamental interactions in Nature are gravitational, electromagnetic, weak and strong. In the last chapter, we met a scheme aimed at unifying electromagnetism and gravitation into a single master theory, using the concept of a gauge field. It relied on having access to an extra dimension in which to work. The project to combine the fundamental forces of nature into a single theory was one of the greatest triumphs of twentieth-century physics. Quantum field theory (QFT) allowed (i) electromagnetism along with (ii) the weak and (iii) the strong interactions to be combined into a quantum gauge field theory known as the Standard Model of particle physics. The fundamental force that resists being incorporated into this scheme on the same basis as the others is gravitation.${}^{1}$ Starting from the attempt of last chapter to accommodate gravitation through extra spatial dimensions, we shall introduce some more recent attempts that try to make sense of the place of gravitation in the quantum world.${}^{2}$

49.1 Extra dimensions

We saw in the last chapter the power of having extra spatial dimensions available in formulating theories. If these extra dimensions exist, we should ask ourselves why we haven't yet been able to detect them through their quantum-mechanical effects, given the ease with which we have detected the three spatial dimensions and single time dimension of conventional field theory. In the last chapter, we guessed that this was due to the large energies involved in their small scale, an intuition we confirm in the next example. The idea is that, in quantum mechanics, a particle can lower its kinetic energy if it spreads its wavefunction over a larger region. So if there are extra dimensions that a quantum particle can access, this leads to a different structure of energy levels compared to the case where the particle is confined to three spatial dimensions.
49.1 Extra dimensions
49.2 String theory
49.3 Parametrizing the string
49.4 Strings in relativity
49.5 Superspace
49.6 Loop quantum gravity
49.7 Anti-de Sitter spacetime
49.8 Our current best guess
Chapter summary
Exercises
${}^{1}$ If we accept the effective-field point of view from the last chapter, it is possible to incorporate gravitation into this scheme (even though, unlike the other forces, gravitation isn't renormalizable) as long as we accept it as a low-energy approximation that we expect to break down at sufficiently high energy.
${}^{2}$ Before getting into the complicated story presented in this chapter, we note, following an argument by Sidney Coleman, that invoking a little quantum mechanics immediately allows us a quick route to gravitational redshift. In a uniform gravitational field, the free-particle Hamiltonian $H=p^{0}=m$ picks up an extra potential energy term $m g h$, where $V=g h$ is equal to the gravitational potential, so we can write the effect of gravity by saying $H \rightarrow H+H V$. The time evolution operator for quantum states, determined by the Hamiltonian, then changes by the presence of the potential according to
\begin{equation*} \hat{U}=\mathrm{e}^{-\mathrm{i} \hat{H} t / \hbar} \rightarrow \mathrm{e}^{-\mathrm{i} \hat{H}(1+V) t / \hbar} . \tag{49.1} \end{equation*}
Gravitation can thus be included in our equations by making the swap to the time parameter $t \rightarrow(1+V) t$. This tells us that clocks run slowest deep down in the potential, where $h$ and therefore $V$ are smallest.
Fig. 49.1 (a) Two-dimensional square well. (b) The well compactified in the $y$ direction.

Example 49.1

Consider a particle in a one-dimensional square well, extended to include an extra dimension. The resulting two-dimensional well is shown in Fig. 49.1(a). It has length $L$ in the $x$ direction. We suppose that the reason the extra ($y$) dimension is not apparent is that it has been compactified, or curled up into a circle. To describe this we can identify $(x, y)$ and $(x, y+2 \pi a)$, and we have the situation shown in Fig. 49.1(b). The space is now a cylinder with circular cross section of circumference $2 \pi a$. The Schrödinger equation for a particle confined in this space is
\begin{equation*} -\frac{\hbar^{2}}{2 m}\left(\frac{\partial^{2} \psi}{\partial x^{2}}+\frac{\partial^{2} \psi}{\partial y^{2}}\right)=E \psi . \tag{49.2} \end{equation*}
The solutions for the square well can be written down. They are
\begin{equation*} \psi=a_{n} \sin \left(\frac{n \pi x}{L}\right)\left[b_{l} \sin \left(\frac{l y}{a}\right)+c_{l} \cos \left(\frac{l y}{a}\right)\right], \tag{49.3} \end{equation*}
where $a_{n}$, $b_{l}$ and $c_{l}$ are constants and $n$ and $l$ are integer quantum numbers. The energies corresponding to these eigenfunctions are
\begin{equation*} E_{n l}=\frac{\hbar^{2}}{2 m}\left[\left(\frac{n \pi}{L}\right)^{2}+\left(\frac{l}{a}\right)^{2}\right] . \tag{49.4} \end{equation*}
The quantum number $l$ can vanish, and so if we set it to zero we recover the usual one-dimensional energy levels. New energy levels only arise for $l>0$. If we assume the extra dimension is curled up into a very small circle, then $a \ll L$ and the second term in eqn 49.4 is large compared to the first. The result is that the extra energy levels lie at $E \approx \hbar^{2} / 2 m a^{2}$, which is potentially a very large energy indeed. So, since $a$ is assumed small, new energy levels appear only at very large $E$.
The last example shows how extra dimensions, if compactified on a very small scale, might not necessarily give rise to measurable effects since they occur at such high energies compared to the capabilities of our measurements.
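The size of the gap in eqn 49.4 can be made concrete with a short numerical sketch. The particle mass (an electron), the well length and the compactification radius below are illustrative assumptions, not values from the text, and the non-relativistic formula is being pushed well beyond its range of validity for such a small $a$; the point is only the ratio between the two sets of levels.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron mass in kg (illustrative choice of particle)
EV = 1.602176634e-19     # one electronvolt in joules


def energy_level(n, l, L, a, m=M_E):
    """Energy E_nl of eqn 49.4 in joules, for well length L and radius a."""
    return HBAR**2 / (2.0 * m) * ((n * math.pi / L) ** 2 + (l / a) ** 2)


# Hypothetical numbers: a 1 nm well and a 1 fm compactified circle.
L_well = 1e-9
a_small = 1e-15

ground = energy_level(1, 0, L_well, a_small)       # l = 0: ordinary 1D level
first_extra = energy_level(1, 1, L_well, a_small)  # l = 1: first new level

print(f"E_10 = {ground / EV:.3g} eV")
print(f"E_11 = {first_extra / EV:.3g} eV")
print(f"ratio E_11 / E_10 = {first_extra / ground:.3g}")
```

With $l=0$ the function returns the familiar one-dimensional levels, while the first $l=1$ level sits higher by a factor of order $(L / \pi a)^{2}$, which is why a small enough $a$ hides the extra dimension from low-energy probes.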
To get an idea of the length scales involved in these arguments, we can use a set of units originally formulated by Max Planck. René Descartes had originally proposed that there might be an underlying system of units that allowed the geometry of the Universe to be expressed in a natural manner. Around 250 years later, Planck proposed such a set of units, which combines three important constants of Nature and so links gravity, relativity, and quantum mechanics. The interaction of these different effects should be expected to occur on this Planck scale.

Example 49.2

The idea is to use dimensional analysis to express the gravitational constant $G$, the speed of light $c$, and the Planck constant $\hbar$ in terms of a basic length, mass, and time scale ($\ell_{\mathrm{P}}$, $m_{\mathrm{P}}$ and $t_{\mathrm{P}}$, respectively). From the dimensions of these constants we find
\begin{equation*} G=\frac{\ell_{\mathrm{P}}^{3}}{m_{\mathrm{P}} t_{\mathrm{P}}^{2}}, \quad c=\frac{\ell_{\mathrm{P}}}{t_{\mathrm{P}}}, \quad \hbar=\frac{m_{\mathrm{P}} \ell_{\mathrm{P}}^{2}}{t_{\mathrm{P}}} . \tag{49.5} \end{equation*}
In order to express the constants of Nature that we measure, the Planck scales are defined as follows (in S.I. units):
\begin{align*} \ell_{\mathrm{P}} & =\left(\frac{G \hbar}{c^{3}}\right)^{\frac{1}{2}}=1.616 \times 10^{-35} \mathrm{~m}, \\ t_{\mathrm{P}} & =\left(\frac{G \hbar}{c^{5}}\right)^{\frac{1}{2}}=5.391 \times 10^{-44} \mathrm{~s}, \\ m_{\mathrm{P}} & =\left(\frac{\hbar c}{G}\right)^{\frac{1}{2}}=2.176 \times 10^{-8} \mathrm{~kg} . \tag{49.6} \end{align*}
The fact that the underlying Planck length scale is so small gives us reason to pause: the effects of quantum mechanics (via expressions involving $\hbar$) and gravity (involving $G$) together are likely only to become clear in processes sharing this underlying scale of units.${}^{3}$
In natural units, where $\hbar=c=1$, length is inversely proportional to momentum and hence also to energy. Exploring a length scale of $10^{-35} \mathrm{~m}$ implies we explore energies${}^{4}$ of $10^{19} \mathrm{GeV}$, which is well beyond the measurement capability of any existing accelerators.${}^{5}$ As a result, what is going on at the Planck length scale remains mysterious. It might well be, for example, that we have the situation proposed in the previous chapter, where extra dimensions are compactified on length scales of $\ell_{\mathrm{P}}$, and they have hence escaped our notice. So if extra dimensions are at least possible from this point of view, it has been suggested that these might provide the necessary backdrop to formulate a unified theory that includes quantum fields and also gravitation.
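The Planck scales and the associated energy can be checked numerically. The following is a minimal sketch that evaluates the definitions in eqn 49.6; the numerical values of $G$, $\hbar$, $c$ and the electronvolt are standard S.I. (CODATA) figures supplied here as assumptions, and the function name `planck_units` is purely illustrative.

```python
import math

G = 6.67430e-11          # gravitational constant, m^3 kg^-1 s^-2
HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 2.99792458e8         # speed of light, m/s
EV = 1.602176634e-19     # one electronvolt in joules


def planck_units(G=G, hbar=HBAR, c=C):
    """Return (length, time, mass) Planck scales from eqn 49.6, in S.I. units."""
    l_p = math.sqrt(G * hbar / c**3)
    t_p = math.sqrt(G * hbar / c**5)
    m_p = math.sqrt(hbar * c / G)
    return l_p, t_p, m_p


l_p, t_p, m_p = planck_units()

# Planck energy E_P = m_P c^2, converted from joules to GeV.
e_p_gev = m_p * C**2 / EV / 1e9

print(f"l_P = {l_p:.4g} m")
print(f"t_P = {t_p:.4g} s")
print(f"m_P = {m_p:.4g} kg")
print(f"E_P = {e_p_gev:.3g} GeV")
```

Computing $E_{\mathrm{P}}=m_{\mathrm{P}} c^{2}$ directly, rather than converting the Planck length to an energy, avoids any ambiguity over factors of $2 \pi$ in the length-energy correspondence.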
If we accept that it might be plausible for extra spatial dimensions to exist, what would be their effect on gravity? In four dimensions [i.e. $(3+1)$-dimensional spacetime], the law of Newtonian gravitation says
\begin{equation*} \nabla^{2} \Phi^{(4)}=4 \pi G \rho, \tag{49.7} \end{equation*}
where $\Phi^{(D)}(x)$ is the gravitational potential for $D$-dimensional spacetime.${}^{6}$ By definition, the mass density is the mass per unit volume of the $d$ spatial dimensions, so if we increase the number of dimensions, the constant of gravitation will have to alter, so that we have
\begin{equation*} \nabla^{2} \Phi^{(D)}=4 \pi G^{(D)} \rho . \tag{49.8} \end{equation*}
Given the dimensionality of the mass density $\rho$, this latter equation also implies on dimensional grounds that the gravitational force falls off as $1 / r^{d-1}$. That is, the force law differs in different numbers of dimensions.
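The same scaling follows from a Gauss's-law argument, sketched here under the assumption of a point mass $M$ in $d$ flat, non-compact spatial dimensions acting on a test mass $m$. The symbol $S_{d-1}$, denoting the surface area of the unit sphere in $d$ dimensions, is introduced only for this illustration.

```latex
\begin{align*}
\int_{V} \nabla^{2} \Phi^{(D)} \,\mathrm{d}^{d} x
  &= \oint_{\partial V} \nabla \Phi^{(D)} \cdot \mathrm{d}\boldsymbol{A}
   = 4 \pi G^{(D)} M, \\
\frac{\mathrm{d} \Phi^{(D)}}{\mathrm{d} r}\, S_{d-1}\, r^{d-1}
  &= 4 \pi G^{(D)} M
  \quad \Rightarrow \quad
  |\boldsymbol{F}| = m \frac{\mathrm{d} \Phi^{(D)}}{\mathrm{d} r}
   = \frac{4 \pi G^{(D)} M m}{S_{d-1}\, r^{d-1}} .
\end{align*}
```

For $d=3$ we have $S_{2}=4 \pi$ and the familiar inverse-square law $G M m / r^{2}$ is recovered.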

Example 49.3

If there were five dimensions, with one compactified with radius $a$, we would have
\begin{equation*} \nabla^{2} \Phi^{(5)}=4 \pi G^{(5)} \rho^{(5)}=4 \pi \frac{G^{(5)}}{2 \pi a} \rho^{(4)} \tag{49.10} \end{equation*}
This would imply that the measured gravitational constant $G$ is actually $G=G^{(5)} /(2 \pi a)$. From this we can deduce that
\begin{equation*} \frac{G^{(5)}}{G}=2 \pi a=\ell_{\mathrm{c}}, \tag{49.11} \end{equation*}
${}^{3}$ It is notable that the Planck mass is much larger than that of any elementary particle and seems the least experimentally inaccessible of the Planck units. The Planck energy scale $E_{\mathrm{P}}=m_{\mathrm{P}} c^{2}$ that derives from this is $1.22 \times 10^{28} \mathrm{eV}$ (often quoted as $1.22 \times 10^{19} \mathrm{GeV}$), which is seven or eight orders of magnitude larger than the highest energy cosmic ray yet observed. One can speculate whether or not this observation is significant (see Exercise 49.1).
${}^{4}$ It is simplest to compute $E_{\mathrm{P}}=m_{\mathrm{P}} c^{2}$, as mentioned in the previous sidenote.
${}^{5}$ The Large Hadron Collider at CERN, a multi-billion dollar project, accelerates proton beams up to nearly 7 TeV per beam. This is incredibly impressive, but a TeV is only 1000 GeV. We are a long way off $10^{19} \mathrm{GeV}$.
${}^{6}$ We use $D$ for the number of spacetime dimensions and $d=D-1$ for the number of spatial dimensions.
The Planck length can be defined in other dimensions. Using some dimensional analysis one can show
\begin{equation*} \left(\ell_{\mathrm{P}}^{(D)}\right)^{D-2}=\frac{\hbar G^{(D)}}{c^{3}}=\ell_{\mathrm{P}}^{2} \frac{G^{(D)}}{G} \tag{49.9} \end{equation*}
Fig. 49.2 String ending on a 2-brane.
${}^{7}$ 'D' here stands for Dirichlet, after Peter Gustav Lejeune Dirichlet (1805-1859). Having fixed string ends is equivalent to a Dirichlet boundary condition $\partial X^{i} / \partial \tau=0$ at the end of the string.
${}^{8}$ As a result of this striking feature, string theory has been viewed as the natural successor to QFT in attempting to find a consistent quantum gravity. It also has a natural description of an $S=2$ massless excitation, i.e. a graviton, leading some to suggest that string theory might have gravitation built into it.
Fig. 49.3 (a) A particle decay process in quantum field theory. (b) A particle decay in string theory.
where $\ell_{\mathrm{c}}$ is the characteristic length scale of the compactified dimension.
The generalization of the last example to $D$-dimensional spacetime is that
\begin{equation*} \frac{G^{(D)}}{G}=\ell_{\mathrm{c}}^{D-4} \tag{49.12} \end{equation*}
The quantity on the right is the volume of the extra dimensions $\mathcal{V}_{\mathrm{c}}$. This gives us an answer to how dimensionality changes the strength of gravitation.
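To get a feel for these scalings, here is a minimal sketch that combines eqn 49.12 with the $D$-dimensional Planck length of eqn 49.9, giving $\left(\ell_{\mathrm{P}}^{(D)}\right)^{D-2}=\ell_{\mathrm{P}}^{2} \ell_{\mathrm{c}}^{D-4}$. The function names and the sample compactification length are hypothetical, chosen only for illustration, and logarithms are used simply to avoid floating-point underflow with such small lengths.

```python
import math

def g_ratio(l_c, D):
    """G^(D) / G = l_c**(D - 4), following eqn 49.12."""
    return l_c ** (D - 4)

def planck_length_D(l_p, l_c, D):
    """D-dimensional Planck length from (l_P^(D))**(D-2) = l_P**2 * l_c**(D-4)."""
    log_value = (2.0 * math.log(l_p) + (D - 4) * math.log(l_c)) / (D - 2)
    return math.exp(log_value)

def force_falloff_exponent(D):
    """Exponent n in force ~ 1/r**n for d = D - 1 spatial dimensions (n = d - 1)."""
    return D - 2

L_P = 1.616e-35   # four-dimensional Planck length in metres (rounded)
L_C = 1.0e-6      # hypothetical size of the extra dimensions, in metres

for D in (4, 5, 6, 10):
    print(D, force_falloff_exponent(D), planck_length_D(L_P, L_C, D))
```

With $D=4$ the ordinary Planck length and the inverse-square law are recovered. For $\ell_{\mathrm{c}}>\ell_{\mathrm{P}}$ the higher-dimensional Planck length is larger than $\ell_{\mathrm{P}}$.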

49.2 String theory

The idea of extra dimensions is taken up in the formalism of string theory. This theory shares many of the techniques of QFT, but involves replacing the idea of fundamental particle-like excitations of fields with excitations of fundamental one-dimensional strings. In short, strings are one-dimensional objects with no internal structure, and particles are excitations of these strings. Strings are not built from parts and so we can't deal with some part of a string. Strings come in many forms: they can be open (meaning the ends do not join) or closed (where they form loops). Open strings can have free endpoints or fixed endpoints. The objects on which strings terminate to fix the ends are characterized by their dimensionality and are called D-branes.${}^{7}$ An example of a string ending on a 2-brane is shown in Fig. 49.2.
First some good news: the use of strings instead of particles largely resolves one of the most persistent conceptual problems with QFT.

Example 49.4

A problem that bedevils QFT is the divergence of quantities and the consequent need for renormalization. A particle decay process in QFT is depicted in the Feynman diagram shown in Fig. 49.3(a), involving the interaction of fields (or particles) at a point. This is the source of the problem: the evaluation of the fields at the same point can cause contributions to perturbation theory to diverge, necessitating the removal of infinities using a number of sophisticated techniques, known as renormalization, whose validity has been the subject of persistent debate. Even if we trust renormalization, it fails to remove the infinities encountered in quantum gravity in the sorts of scattering processes described in Chapter 47. The QFT divergences are avoided to some extent in string theory, where particle processes resemble the example shown in Fig. 49.3(b). Here the particles, represented as closed strings, never meet at a point.${}^{8}$
If we accept that strings might be of some use, then we should examine how to incorporate them into field theory. Fortunately, we can use many of the techniques developed in the course of this book to investigate strings. To get an idea how to do this, let's return to the relativistic description of particles and build a string theory from there.
A free particle of mass $m$ has a world line found by extremizing an action
\begin{equation*} S=-m \int \mathrm{d} \tau\left(-g_{\mu \nu} \frac{\mathrm{d} x^{\mu}}{\mathrm{d} \tau} \frac{\mathrm{d} x^{\nu}}{\mathrm{d} \tau}\right)^{\frac{1}{2}} \tag{49.13} \end{equation*}
Here the world line is parametrized by $\tau$. This parametrization is usually the proper time for massive particles, but could be chosen to be another affine parameter.${}^{9}$ We use a set of coordinates $x^{\mu}$ with which we can capture the motion of the particle and express the world line as a curve $x^{\mu}(\tau)$. The equation of motion derived from the particle's action is $\mathrm{d} p_{\mu}(\tau) / \mathrm{d} \tau=0$, where the momentum $p_{\mu}=\partial L / \partial \dot{x}^{\mu}$.
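The freedom to choose the parameter along the world line can be illustrated numerically. The sketch below, which assumes flat spacetime with signature $(-,+,+,+)$ and units with $c=1$, evaluates the action of eqn 49.13 by a midpoint rule for a particle moving at constant speed, once with the parameter equal to coordinate time and once with a deliberately distorted parameter. The function names, the speed 0.6 and the distorted parametrization are our own illustrative choices.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal Minkowski metric, signature (-,+,+,+)

def particle_action(worldline, lam0, lam1, mass=1.0, steps=2000):
    """Midpoint-rule estimate of S = -m * integral of sqrt(-eta x' x') d(lambda)."""
    dlam = (lam1 - lam0) / steps
    h = 0.5 * dlam
    total = 0.0
    for k in range(steps):
        lam = lam0 + (k + 0.5) * dlam
        x_plus = worldline(lam + h)
        x_minus = worldline(lam - h)
        # Central-difference estimate of dx^mu / d(lambda).
        vel = [(a - b) / (2.0 * h) for a, b in zip(x_plus, x_minus)]
        norm2 = sum(e * v * v for e, v in zip(ETA, vel))
        total += math.sqrt(-norm2) * dlam
    return -mass * total

def straight(lam):
    """Parameter equal to coordinate time; speed 0.6 along x."""
    return (lam, 0.6 * lam, 0.0, 0.0)

def warped(lam):
    """Same world line from t = 0 to t = 1, traversed with t = lambda**2."""
    return (lam**2, 0.6 * lam**2, 0.0, 0.0)

print(particle_action(straight, 0.0, 1.0))
print(particle_action(warped, 0.0, 1.0))
```

Both parametrizations describe the same curve in spacetime and return the same action, equal to $-m$ times the elapsed proper time $\sqrt{1-v^{2}}$ per unit coordinate time.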
The treatment of the string is very similar to that of the particle. The major difference is that since the string is a one-dimensional object, rather than a world line, it traces out a two-dimensional world sheet. Just as we parametrize the world line with a parameter $\tau$, we parametrize the world sheet with two parameters $\tau$ and $\sigma$. The names are chosen to allow us to think of the $\tau$ parameter as telling us about the behaviour with respect to time and $\sigma$ as the coordinate telling us how far we are along the string.${}^{10}$ For an open string $\sigma$ ranges from 0 to some maximum $\sigma_{*}$. The world sheet is potentially curved in an interesting way and so we seek to embed it in Minkowski space, in the same way that we describe in Appendix D.
We shall choose to work in Minkowski space with coordinates $x^{\mu}=(t, x, y, z)$. In this space, positions on the string's world sheet have coordinates $X^{\mu}=(T, X, Y, Z)$, so that a position vector of a point on the string world sheet is written as $\boldsymbol{X}(\tau, \sigma)$ [i.e. input $\tau$ and $\sigma$ labelling the part of the world sheet, return coordinates $x^{\mu}=(t, x, y, z)=(T, X, Y, Z)=X^{\mu}$ labelling the point on the world sheet]. The goal is to describe the world sheet of the string, some examples of which are shown in Fig. 49.4. To do this we embed the world sheet in Minkowski space, and then find the induced metric $\gamma$ of the world sheet. As discussed in Appendix D, the metric $\gamma$ has components given by
\begin{equation*} \gamma_{\alpha \beta}=\eta_{\mu \nu} \frac{\partial X^{\mu}}{\partial \xi^{\alpha}} \frac{\partial X^{\nu}}{\partial \xi^{\beta}} \tag{49.14} \end{equation*}
and here we choose coordinates $\xi^{1}=\tau$ and $\xi^{2}=\sigma$. To simplify our expressions we define some notation
\begin{equation*} \dot{X}^{\mu}=\frac{\partial X^{\mu}}{\partial \tau}, \quad\left(X^{\mu}\right)^{\prime}=\frac{\partial X^{\mu}}{\partial \sigma} . \tag{49.15} \end{equation*}
Using this notation, the induced metric can be written as a $2 \times 2$ matrix
\begin{equation*} \gamma_{\alpha \beta}(\tau, \sigma)=\left(\begin{array}{cc} (\dot{\boldsymbol{X}})^{2} & \dot{\boldsymbol{X}} \cdot \boldsymbol{X}^{\prime} \\ \dot{\boldsymbol{X}} \cdot \boldsymbol{X}^{\prime} & \left(\boldsymbol{X}^{\prime}\right)^{2} \end{array}\right) \tag{49.16} \end{equation*}
where the products are things like $\dot{\boldsymbol{X}} \cdot \boldsymbol{X}^{\prime}=\eta_{\mu \nu} \dot{X}^{\mu}\left(X^{\prime}\right)^{\nu}$, and so on. This provides us with a simple way to describe the world sheet.
Just as the action for a particle is proportional to the interval along the world line, the action for the string is proportional to the effective area
${}^{9}$ Indeed the ability to reparametrize the world line in Chapter 8 led to substantial simplifications. This is also the case in the next section.
${}^{10}$ There is nothing forcing us to do this though. While such an interpretation isn't mandated, it provides the simplest description of the string's motion.
Fig. 49.4 (a) Open string world sheet. (b) Closed string world sheet.
of the world sheet. For the world sheet with a metric $\gamma$, the area $A$ is
\begin{equation*} A=\int \mathrm{d} \tau \mathrm{~d} \sigma \sqrt{-\gamma} \tag{49.17} \end{equation*}
where $\gamma$ is the determinant of the matrix $\gamma_{\alpha \beta}$. Using eqn 49.16 we can compute the determinant as $\gamma=(\dot{\boldsymbol{X}})^{2}\left(\boldsymbol{X}^{\prime}\right)^{2}-\left(\dot{\boldsymbol{X}} \cdot \boldsymbol{X}^{\prime}\right)^{2}$. Finally, the constant of proportionality that relates the area to the string action is the tension in the string $T_{0}$, which plays a role like the mass does in the particle theory.
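As a concrete illustration of eqns 49.14 to 49.17, the following sketch builds the induced metric for a given embedding $X^{\mu}(\tau, \sigma)$ by central finite differences and returns the area density $\sqrt{-\gamma}$. It assumes the flat metric with signature $(-,+,+,+)$; the function names and the two sample world sheets (a static straight string and a straight string moving sideways at speed 0.6) are our own choices for illustration.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal Minkowski metric, signature (-,+,+,+)

def mdot(a, b):
    """Minkowski product eta_{mu nu} a^mu b^nu."""
    return sum(e * x * y for e, x, y in zip(ETA, a, b))

def induced_metric(X, tau, sigma, h=1e-5):
    """2x2 induced metric gamma_{alpha beta} from finite-difference tangents."""
    Xdot = [(p - m) / (2 * h) for p, m in zip(X(tau + h, sigma), X(tau - h, sigma))]
    Xprime = [(p - m) / (2 * h) for p, m in zip(X(tau, sigma + h), X(tau, sigma - h))]
    cross = mdot(Xdot, Xprime)
    return [[mdot(Xdot, Xdot), cross],
            [cross, mdot(Xprime, Xprime)]]

def area_density(gamma):
    """sqrt(-det gamma), the integrand of the world-sheet area."""
    det = gamma[0][0] * gamma[1][1] - gamma[0][1] * gamma[1][0]
    return math.sqrt(-det)

def static_string(tau, sigma):
    """Straight string at rest along the x axis."""
    return (tau, sigma, 0.0, 0.0)

def moving_string(tau, sigma):
    """Straight string along x, moving in the y direction at speed 0.6."""
    return (tau, sigma, 0.6 * tau, 0.0)

print(induced_metric(static_string, 0.3, 0.7))
print(area_density(induced_metric(moving_string, 0.3, 0.7)))
```

For the static string the metric is diagonal with entries $-1$ and $+1$, so $\sqrt{-\gamma}=1$. For the moving string the area density is reduced to $\sqrt{1-v^{2}}$.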
With these ingredients we can write the resulting Nambu-Goto action
Yoichiro Nambu (1921-2015) and Tetsuo Goto (1931-1982)
${}^{11}$ Recall from Chapter 40 that the Euler-Lagrange equation for a Lagrangian density $\mathcal{L}$ built from scalar fields is
\begin{equation*} \frac{\partial \mathcal{L}}{\partial \phi}=\left(\frac{\partial \mathcal{L}}{\partial \phi_{, \mu}}\right)_{, \mu} \tag{49.20} \end{equation*}
For a Lagrangian built from a vector field we saw in Exercise 40.3 that we can identify momentum-like variables
\begin{equation*} \mathcal{P}_{\mu}^{\nu}=\frac{\partial \mathcal{L}}{\partial X^{\mu}{}_{, \nu}} \tag{49.21} \end{equation*}
where $\nu=\tau$ and $\sigma$ and $\mu=t, x, y$ and $z$. We then write an Euler-Lagrange equation
\begin{align*} \frac{\partial \mathcal{L}}{\partial X^{\mu}} & =\left(\frac{\partial \mathcal{L}}{\partial X^{\mu}{}_{, \nu}}\right)_{, \nu} \\ & =\left(\mathcal{P}_{\mu}^{\nu}\right)_{, \nu} \tag{49.22} \end{align*}
where we sum over $\nu=\tau$ and $\sigma$.
Fig. 49.5 Static gauge involves setting $\tau=t$. The plane $t=$ const. then cuts the world sheet at $\tau=t$.
\begin{equation*} S=-T_{0} \int \mathrm{d} \tau \mathrm{~d} \sigma\left[\left(\dot{\boldsymbol{X}} \cdot \boldsymbol{X}^{\prime}\right)^{2}-(\dot{\boldsymbol{X}})^{2}\left(\boldsymbol{X}^{\prime}\right)^{2}\right]^{\frac{1}{2}} \tag{49.18} \end{equation*}
This action corresponds to a Lagrangian density of
\begin{equation*} \mathcal{L}=-T_{0}\left[\left(\dot{\boldsymbol{X}} \cdot \boldsymbol{X}^{\prime}\right)^{2}-(\dot{\boldsymbol{X}})^{2}\left(\boldsymbol{X}^{\prime}\right)^{2}\right]^{\frac{1}{2}} \tag{49.19} \end{equation*}
Given this Lagrangian, we can compute equations of motion.

Example 49.5

There are two variables in the problem and so we can identify two momentum-like variables.${}^{11}$ We write
\begin{equation*} P_{\mu}^{\tau}=\frac{\partial \mathcal{L}}{\partial \dot{X}^{\mu}}=-T_{0} \frac{\left(\dot{\boldsymbol{X}} \cdot \boldsymbol{X}^{\prime}\right) X_{\mu}^{\prime}-\left(\boldsymbol{X}^{\prime}\right)^{2} \dot{X}_{\mu}}{\left[\left(\dot{\boldsymbol{X}} \cdot \boldsymbol{X}^{\prime}\right)^{2}-(\dot{\boldsymbol{X}})^{2}\left(\boldsymbol{X}^{\prime}\right)^{2}\right]^{\frac{1}{2}}} \tag{49.23} \end{equation*}
and
\begin{equation*} P_{\mu}^{\sigma}=\frac{\partial \mathcal{L}}{\partial\left(X^{\prime}\right)^{\mu}}=-T_{0} \frac{\left(\dot{\boldsymbol{X}} \cdot \boldsymbol{X}^{\prime}\right) \dot{X}_{\mu}-(\dot{\boldsymbol{X}})^{2} X_{\mu}^{\prime}}{\left[\left(\dot{\boldsymbol{X}} \cdot \boldsymbol{X}^{\prime}\right)^{2}-(\dot{\boldsymbol{X}})^{2}\left(\boldsymbol{X}^{\prime}\right)^{2}\right]^{\frac{1}{2}}} \tag{49.24} \end{equation*}
Using these momenta, the equations of motion are
\begin{equation*} \frac{\partial P_{\mu}^{\tau}}{\partial \tau}+\frac{\partial P_{\mu}^{\sigma}}{\partial \sigma}=0 \tag{49.25} \end{equation*}
When written out, these look rather complicated, but they can be simplified, as we shall see in the next section.
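The two momentum-like variables can be checked numerically. The sketch below differentiates the Lagrangian density of eqn 49.19 by central finite differences and compares the result with the closed forms $-T_{0}\left[(\dot{\boldsymbol{X}} \cdot \boldsymbol{X}^{\prime}) X_{\mu}^{\prime}-(\boldsymbol{X}^{\prime})^{2} \dot{X}_{\mu}\right] / \sqrt{\cdots}$ and $-T_{0}\left[(\dot{\boldsymbol{X}} \cdot \boldsymbol{X}^{\prime}) \dot{X}_{\mu}-(\dot{\boldsymbol{X}})^{2} X_{\mu}^{\prime}\right] / \sqrt{\cdots}$, where $\sqrt{\cdots}$ is the square root appearing in the Lagrangian. The signature $(-,+,+,+)$, the choice $T_{0}=1$, the function names and the sample tangent vectors are all assumptions made for illustration.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal Minkowski metric, signature (-,+,+,+)
T0 = 1.0                      # string tension, set to 1 for this check

def mdot(a, b):
    """Minkowski product eta_{mu nu} a^mu b^nu."""
    return sum(e * x * y for e, x, y in zip(ETA, a, b))

def lagrangian(Xdot, Xprime):
    """Nambu-Goto Lagrangian density as a function of the two tangent vectors."""
    cross = mdot(Xdot, Xprime)
    return -T0 * math.sqrt(cross**2 - mdot(Xdot, Xdot) * mdot(Xprime, Xprime))

def momenta(Xdot, Xprime):
    """Closed-form P^tau_mu and P^sigma_mu (lower index mu)."""
    cross = mdot(Xdot, Xprime)
    d2, p2 = mdot(Xdot, Xdot), mdot(Xprime, Xprime)
    root = math.sqrt(cross**2 - d2 * p2)
    Xdot_low = [e * x for e, x in zip(ETA, Xdot)]
    Xprime_low = [e * x for e, x in zip(ETA, Xprime)]
    P_tau = [-T0 * (cross * xp - p2 * xd) / root for xd, xp in zip(Xdot_low, Xprime_low)]
    P_sigma = [-T0 * (cross * xd - d2 * xp) / root for xd, xp in zip(Xdot_low, Xprime_low)]
    return P_tau, P_sigma

def numerical_gradient(f, v, h=1e-6):
    """Central-difference gradient of f with respect to the components of v."""
    grad = []
    for i in range(len(v)):
        up, down = list(v), list(v)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

# Sample tangent vectors: Xdot timelike, Xprime spacelike (illustrative values).
XDOT = [1.0, 0.3, 0.1, 0.0]
XPRIME = [0.0, 0.2, 1.0, 0.5]

P_tau, P_sigma = momenta(XDOT, XPRIME)
P_tau_num = numerical_gradient(lambda v: lagrangian(v, XPRIME), XDOT)
P_sigma_num = numerical_gradient(lambda v: lagrangian(XDOT, v), XPRIME)
print(P_tau, P_tau_num)
print(P_sigma, P_sigma_num)
```

The analytic and numerical momenta agree to within the accuracy of the finite differences for these sample vectors.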

49.3 Parametrizing the string

We can choose the parametrizations to simplify the equations of motion as much as possible, just as we did in Chapter 9. In this context, the choice of parametrization is known as choosing a gauge. First we make a choice of the timelike parameter $\tau$. The most direct choice is to treat this as the coordinate time by fixing $\tau=t$, which is a choice known as static gauge. In this gauge, lines of constant $\tau$ represent static strings, i.e. an observer [with coordinates $(t, x, y, z)$] will see a static string at a fixed time $t$, as shown in Fig. 49.5. In static gauge, the derivatives of the world sheet coordinates $X^{\mu}=\left(t, X^{i}\right)$ have components
\begin{equation*} \frac{\partial X^{\mu}}{\partial \sigma}=\left(0, \frac{\partial \vec{X}}{\partial \sigma}\right), \quad \frac{\partial X^{\mu}}{\partial \tau}=\left(1, \frac{\partial \vec{X}}{\partial t}\right), \tag{49.26} \end{equation*}
where the 3-vector $\vec{X}$ has components $X^{i}$. Static gauge allows us to work in terms of the velocities (i.e. derivatives with respect to time $t=\tau$). The most useful of these is the transverse velocity that we discuss in the next example.

Example 49.6

Consider a static string at a fixed value of $t=\tau$. An element of string length can be written using an infinitesimal $\mathrm{d} s$, where
\begin{equation*} \mathrm{d} s=\left|\frac{\mathrm{d} \vec{X}}{\mathrm{~d} \sigma}\right| \mathrm{d} \sigma \tag{49.27} \end{equation*}
Here, $\sigma$ is the parameter along the length of the static string and the vector $\mathrm{d} \vec{X} / \mathrm{d} \sigma$ is tangent to the string. We can then deduce, using the previous equation, that
\begin{equation*} \frac{\partial \vec{X}}{\partial s} \cdot \frac{\partial \vec{X}}{\partial s}=\left(\frac{\partial \vec{X}}{\partial \sigma}\right)^{2}\left(\frac{\partial \sigma}{\partial s}\right)^{2}=1 \tag{49.28} \end{equation*}
so the quantity $\partial \vec{X} / \partial s$ is a unit vector. The vectors $\partial \vec{X} / \partial \sigma$ and $\partial \vec{X} / \partial s$ are parallel since $\partial \vec{X} / \partial s=(\partial \vec{X} / \partial \sigma)(\partial \sigma / \partial s)$, so $\partial \vec{X} / \partial s$ is a unit tangent vector to the string.
To project out the part of a vector $\boldsymbol{v}$ that is perpendicular to a unit vector $\boldsymbol{n}$ we write $\boldsymbol{v}_{\perp}=\boldsymbol{v}-(\boldsymbol{v} \cdot \boldsymbol{n}) \boldsymbol{n}$. With this simple rule we can use the string coordinate velocity $\vec{v}=\partial \vec{X} / \partial t$ and obtain the transverse velocity $\vec{v}_{\perp}$ thus
\begin{equation*} \vec{v}_{\perp}=\frac{\partial \vec{X}}{\partial t}-\left(\frac{\partial \vec{X}}{\partial t} \cdot \frac{\partial \vec{X}}{\partial s}\right) \frac{\partial \vec{X}}{\partial s} \tag{49.29} \end{equation*}
We then use this transverse velocity to simplify the string action. From the component eqns 49.26, we find
\begin{equation*} \dot{\boldsymbol{X}}^{2}=-1+\left(\frac{\partial \vec{X}}{\partial t}\right)^{2}, \quad\left(\boldsymbol{X}^{\prime}\right)^{2}=\left(\frac{\partial \vec{X}}{\partial \sigma}\right)^{2} \tag{49.30} \end{equation*}
Putting everything together, the part in the square root of the string Lagrangian in eqn 49.19 becomes
\begin{align*} \left(\dot{\boldsymbol{X}} \cdot \boldsymbol{X}^{\prime}\right)^{2}-(\dot{\boldsymbol{X}})^{2}\left(\boldsymbol{X}^{\prime}\right)^{2} & =\left(\frac{\partial \vec{X}}{\partial t} \cdot \frac{\partial \vec{X}}{\partial \sigma}\right)^{2}+\left[1-\left(\frac{\partial \vec{X}}{\partial t}\right)^{2}\right]\left(\frac{\partial \vec{X}}{\partial \sigma}\right)^{2} \\ & =\left(\frac{\mathrm{d} s}{\mathrm{~d} \sigma}\right)^{2}\left[\left(\frac{\partial \vec{X}}{\partial t} \cdot \frac{\partial \vec{X}}{\partial s}\right)^{2}+1-\left(\frac{\partial \vec{X}}{\partial t}\right)^{2}\right] \\ & =\left(\frac{\mathrm{d} s}{\mathrm{~d} \sigma}\right)^{2}\left(1-v_{\perp}^{2}\right) \tag{49.31} \end{align*}
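The projection rule and the final identity of this example are easy to test with numbers. In the sketch below, the 3-vectors standing in for $\partial \vec{X} / \partial t$ and $\partial \vec{X} / \partial \sigma$ are arbitrary illustrative values, and the function names are hypothetical.

```python
import math

def dot3(a, b):
    """Ordinary Euclidean dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def transverse_velocity(v, dX_dsigma):
    """Remove from v its component along the unit tangent dX/ds (eqn 49.29)."""
    ds_dsigma = math.sqrt(dot3(dX_dsigma, dX_dsigma))
    tangent = [x / ds_dsigma for x in dX_dsigma]  # unit tangent dX/ds
    along = dot3(v, tangent)
    return [vi - along * ti for vi, ti in zip(v, tangent)]

def root_argument(v, dX_dsigma):
    """Static-gauge form of the quantity under the square root (first line of eqn 49.31)."""
    return dot3(v, dX_dsigma) ** 2 + (1.0 - dot3(v, v)) * dot3(dX_dsigma, dX_dsigma)

V = [0.3, 0.1, 0.0]          # sample dX/dt
DX_DSIGMA = [0.2, 1.0, 0.5]  # sample dX/dsigma

v_perp = transverse_velocity(V, DX_DSIGMA)
lhs = root_argument(V, DX_DSIGMA)
rhs = dot3(DX_DSIGMA, DX_DSIGMA) * (1.0 - dot3(v_perp, v_perp))
print(v_perp, lhs, rhs)
```

The two sides agree, and $\vec{v}_{\perp}$ is orthogonal to the string tangent, as the projection requires.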
Using the results from the last example, we can write down a simplified version of the string action
\begin{equation*} S=-T_{0} \int \mathrm{d} \tau \mathrm{~d} \sigma\left[\frac{\mathrm{d} s}{\mathrm{~d} \sigma}\left(1-v_{\perp}^{2}\right)^{\frac{1}{2}}\right] \tag{49.32} \end{equation*}
Although this is certainly an improvement, one final choice of σ σ sigma\sigmaσ parametrization gives us the simplified equation of motion that we're after.

Example 49.7

We shall attempt to fix the parametrization of $\sigma$ such that the string velocity $\vec{v}$ is perpendicular to the string tangent, which would mean
\begin{equation*} \frac{\partial \vec{X}}{\partial t} \cdot \frac{\partial \vec{X}}{\partial \sigma}=0 \tag{49.33} \end{equation*}
This is useful as the transverse velocity becomes $\vec{v}_{\perp}=\partial \vec{X} / \partial t$, and, in terms of this $\vec{v}_{\perp}$, the momenta from eqns 49.23 and 49.24 can be rewritten as
\begin{align*} & P^{\tau \mu}=T_{0}\left(1-v_{\perp}^{2}\right)^{-\frac{1}{2}} \frac{\mathrm{d} s}{\mathrm{~d} \sigma} \frac{\partial X^{\mu}}{\partial t} \\ & P^{\sigma \mu}=-T_{0}\left(1-v_{\perp}^{2}\right)^{\frac{1}{2}} \frac{\partial X^{\mu}}{\partial s} \tag{49.34} \end{align*}
Plugging these latter equations into the Euler-Lagrange equation gives us the equation of motion
\begin{equation*} \frac{\partial^{2} \vec{X}}{\partial t^{2}}=\left(1-v_{\perp}^{2}\right)^{\frac{1}{2}} \frac{\mathrm{d} \sigma}{\mathrm{d} s} \frac{\partial}{\partial \sigma}\left[\left(1-v_{\perp}^{2}\right)^{\frac{1}{2}} \frac{\mathrm{d} \sigma}{\mathrm{d} s} \frac{\partial \vec{X}}{\partial \sigma}\right] \tag{49.35} \end{equation*}
The choice of the parametrization $\sigma$ that completes the preceding chain of simplifications is to fix
\begin{equation*} \frac{\mathrm{d} s}{\mathrm{~d} \sigma}\left(1-v_{\perp}^{2}\right)^{-\frac{1}{2}}=1 \tag{49.36} \end{equation*}
This is a choice compatible with eqn 49.33, and also gives rise to the goal: a simplified equation of motion reading
\begin{equation*} \frac{\partial^{2} \vec{X}}{\partial t^{2}}=\frac{\partial^{2} \vec{X}}{\partial \sigma^{2}} \tag{49.37} \end{equation*}
This is a wave equation, whose solutions (which describe the dynamics of the string) can be written down directly: they are simply plane waves.
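To see the gauge conditions and the wave equation working together, one can test a candidate string motion numerically. The sketch below uses a rigidly rotating open string, $\vec{X}(t, \sigma)=a \cos (\sigma / a)\,(\cos (t / a), \sin (t / a))$, which is an assumed test configuration and not one discussed in this chapter. It checks three things by finite differences: the wave equation 49.37, the orthogonality condition 49.33, and the condition $(\partial \vec{X} / \partial \sigma)^{2}+(\partial \vec{X} / \partial t)^{2}=1$ that follows from eqn 49.36 when $\vec{v}_{\perp}=\partial \vec{X} / \partial t$. The length scale `A` and all function names are hypothetical.

```python
import math

A = 1.0  # hypothetical length scale; sigma runs from 0 to pi * A

def X(t, sigma):
    """Assumed test configuration: an open string rotating rigidly in a plane."""
    r = A * math.cos(sigma / A)
    return (r * math.cos(t / A), r * math.sin(t / A))

def first_derivs(t, sigma, h=1e-4):
    """Central-difference estimates of dX/dt and dX/dsigma."""
    Xt = [(p - m) / (2 * h) for p, m in zip(X(t + h, sigma), X(t - h, sigma))]
    Xs = [(p - m) / (2 * h) for p, m in zip(X(t, sigma + h), X(t, sigma - h))]
    return Xt, Xs

def second_derivs(t, sigma, h=1e-3):
    """Central-difference estimates of the second t and sigma derivatives."""
    c = X(t, sigma)
    Xtt = [(p - 2 * q + m) / h**2 for p, q, m in zip(X(t + h, sigma), c, X(t - h, sigma))]
    Xss = [(p - 2 * q + m) / h**2 for p, q, m in zip(X(t, sigma + h), c, X(t, sigma - h))]
    return Xtt, Xss

def residuals(t, sigma):
    """Sizes of the violations of eqn 49.37, eqn 49.33 and the sigma-gauge condition."""
    Xt, Xs = first_derivs(t, sigma)
    Xtt, Xss = second_derivs(t, sigma)
    wave = max(abs(a - b) for a, b in zip(Xtt, Xss))
    orthogonality = abs(sum(a * b for a, b in zip(Xt, Xs)))
    normalization = abs(sum(a * a for a in Xt) + sum(b * b for b in Xs) - 1.0)
    return wave, orthogonality, normalization

print(residuals(0.4, 0.9))
```

All three residuals are zero to within the accuracy of the finite differences, so this motion satisfies the wave equation in the chosen parametrization.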
Our conclusion is that, with the right choice of parameters for the world sheet, the strings support wavelike excitations. These can be used to describe particle excitations. Further development of the theory would involve quantizing the string motion to extract the properties of the particles.${}^{12}$ For example, the graviton can be identified as a vibrational state of a closed string.

49.4 Strings in relativity

Strings crop up in several places in general relativity. Most prominent are superstrings, which are strings with dimensions of the Planck length that possess supersymmetry.
${}^{14}$ All dark matter searches performed using detectors on Earth tacitly assume that candidate dark matter particles interact with ordinary matter by a non-gravitational force, e.g. electromagnetic coupling. Such experiments have not yet succeeded in detecting any such particles. But perhaps dark matter particles only interact gravitationally, explaining why they are so hard to find in particle physics experiments.
${}^{12}$ This is discussed in the book by Zwiebach (2009).
${}^{13}$ The gravitational interaction between matter in a galaxy and dark matter leads to the formation of a dark matter halo surrounding each galaxy, inferred from the observed effect on the motion of stars orbiting the galaxy.


Example 49.8

Observations suggest that most of the matter in the Universe is not seen: it is dark matter, which does not interact with light. In fact, it only seems to interact via its gravitational effect,${}^{13}$ explaining why it has never been directly detected.${}^{14}$ One solution to this problem invokes supersymmetry. In supersymmetry, every boson in the Universe has a partner fermion of the same mass (and vice versa). We can then identify a symmetry operator with the property that
\begin{align*} \hat{Q}|\text{boson}\rangle & =|\text{fermion}\rangle, \\ \hat{Q}|\text{fermion}\rangle & =|\text{boson}\rangle \tag{49.38} \end{align*}
Acting on boson $A$ with $\hat{Q}$ gives fermion $A$-ino. (Example: $\hat{Q}$ acting on a photon $\gamma$ gives a photino $\tilde{\gamma}$.) Acting on fermion $B$ gives boson $\mathrm{s} B$. (Example: $\hat{Q}$ acting on a quark $q$ gives a squark $\tilde{q}$.)
Perhaps these new particles generated by supersymmetry compose dark matter. If so, the darkness of dark matter follows from quantum mechanical selection rules that suppress the probability amplitudes for matter to interact with dark matter. There is currently no experimental evidence for the existence of these extra particles.
Superstring theory allows the incorporation of supersymmetric particles into the theory. A generalization of the original superstring models is called $m$-brane theory, or $M$-theory, where $m$ is the number of dimensions. (The theory suggests that up to $m=9$ branes can exist.) It is hoped that $M$-theory could allow a consistent theory to be developed that incorporates gravitation and the other three fundamental interactions. However, it is a well-known problem that $M$-theory has not yet been able to make falsifiable predictions that can be tested by experiment. We have seen how extra dimensions should lead to a modification of the dependence of the gravitational force with distance and so, at the Planck length, measurable differences might be detectable. Of course, we have no means of making measurements at the Planck length at present.
In a completely different context, there are suggestions that there exists another sort of string in the Universe, this type potentially spanning large distances. These cosmic strings are created in the early Universe on a microscopic length scale and stretched out during the subsequent expansion. They would have a gravitational effect owing to their tension, meaning they could be detected via gravitational lensing (see Chapter 24). The lensing effect due to the string is predicted to cause a distant light source (e.g. a star) to show up as two images, owing to the curvature of spacetime around the string.

Example 49.9

The origin of cosmic strings follows from the phase transition in the early Universe that we discussed in Chapter 41. The simplest system showing a phase transition is that of the scalar field where, on cooling, the potential changes from that shown in Fig. 49.6(a) to that in Fig. 49.6(b), with the assumption that the system falls into one of the two minima, which occur at a field $\phi= \pm \phi_{0}$. However, in any phase transition, we have the possibility of forming domains. These result in the system breaking symmetry in different ways in different regions of space. That is, part of the system falls into one minimum in the potential, and part falls into another. For the scalar field we might have the situation shown in Fig. 49.7. On the left the system has fallen into the minimum in the broken symmetry potential at $-\phi_{0}$; on the right it has fallen into the minimum at $\phi_{0}$. The space between these two domains involves a field that must smoothly evolve between $-\phi_{0}$ and $\phi_{0}$. These regions are called defects. A one-dimensional defect, known as a wall or a kink, is shown in Fig. 49.7.
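The kink described above can be illustrated numerically. The sketch below assumes a quartic double-well potential $V(\phi)=\frac{\lambda}{4}(\phi^{2}-\phi_{0}^{2})^{2}$, which is an illustrative choice and is not specified in this section; the names `kink`, `dV` and `residual` are hypothetical. For that assumed potential, the static profile $\phi(x)=\phi_{0}\tanh(kx)$ with $k=\phi_{0}\sqrt{\lambda/2}$ interpolates between the two minima, and the code checks that it satisfies $\phi''=\mathrm{d}V/\mathrm{d}\phi$.

```python
import math

# Assumption: V(phi) = (lam / 4) * (phi**2 - phi0**2)**2, chosen for illustration only.

def kink(x, phi0=1.0, lam=1.0):
    """Kink profile phi0 * tanh(k x) with k = phi0 * sqrt(lam / 2)."""
    k = phi0 * math.sqrt(lam / 2.0)
    return phi0 * math.tanh(k * x)

def dV(phi, phi0=1.0, lam=1.0):
    """Derivative dV/dphi of the assumed quartic potential."""
    return lam * phi * (phi * phi - phi0 * phi0)

def residual(x, h=1e-3, phi0=1.0, lam=1.0):
    """Finite-difference estimate of phi'' - dV/dphi, which should be near zero."""
    second = (kink(x + h, phi0, lam) - 2.0 * kink(x, phi0, lam)
              + kink(x - h, phi0, lam)) / (h * h)
    return second - dV(kink(x, phi0, lam), phi0, lam)

for x in (-20.0, -1.0, 0.0, 1.0, 20.0):
    print(x, kink(x), residual(x))
```

Far to the left the field sits at $-\phi_{0}$, far to the right at $+\phi_{0}$, and the defect is the narrow region around $x=0$ where the field passes through zero.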
Fig. 49.6 The potential discussed in Chapter 41: (a) shows the high temperature potential; (b) shows the broken symmetry potential at low temperature.
Fig. 49.7 A domain wall linking two regions where symmetry is broken in different ways.
Fig. 49.8 The broken symmetry potential for the complex scalar field.
Fig. 49.9 The vortex field configuration.
${}^{15}$ Further information on defects and vortices in field theory can be found in our Quantum Field Theory for the Gifted Amateur.
${}^{16}$ The curvature around the cosmic string is examined further in the exercises.
A more interesting example takes place for the complex scalar field $\psi(x)$. The broken-symmetry potential-energy surface for this field is shown in Fig. 49.8. It is two dimensional, reflecting the two degrees of freedom that the complex field possesses (i.e. the real and imaginary parts). The potential resembles the punt at the bottom of a wine bottle, or a Mexican hat (it is sometimes called the Mexican-hat potential for this reason). There are an infinite number of minima in this potential that occur at the same radius $\left|\phi_{0}\right|$ in the complex plane, but at different values of the complex phase $\theta(x)$. In choosing a broken symmetry ground state, the complex field $\psi(x)=|\psi(x)| \mathrm{e}^{\mathrm{i} \theta(x)}$ picks a particular phase value $\left|\psi_{0}\right| \mathrm{e}^{\mathrm{i} \theta_{0}}$ by selecting the phase angle $\theta_{0}$.
The defects in this potential are shown (in two spatial dimensions) in Fig. 49.9, and are known as vortices.${}^{15}$ Like the domain wall, the vortex can be understood as the simplest way in which the system can have different spatial parts in different minima of the potential. The resulting vortex has the feature that the gradient in the phase $\boldsymbol{\nabla} \theta(x)$ diverges at the centre of the vortex, giving rise to a singularity. To translate this picture into three dimensions, we imagine stacking vortices on top of each other, such that the singularities form a one-dimensional path. This curve through the vortex cores is the cosmic string.${}^{16}$
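The divergence of the phase gradient at the vortex core can be made concrete with a small numerical sketch. It assumes the simplest single vortex, with phase $\theta=\arctan(y/x)$ centred on the origin; the function names are hypothetical. The code estimates $|\boldsymbol{\nabla}\theta|$ at a distance $r$ from the core (analytically $1/r$) and counts how many times the phase winds around a closed loop.

```python
import math

def phase(x, y):
    """Phase theta of a single vortex centred on the origin (an assumed configuration)."""
    return math.atan2(y, x)

def phase_gradient(radius):
    """Finite-difference |grad theta| at the point (radius, 0); analytically 1/radius."""
    h = radius * 1e-4
    d_dx = (phase(radius + h, 0.0) - phase(radius - h, 0.0)) / (2.0 * h)
    d_dy = (phase(radius, h) - phase(radius, -h)) / (2.0 * h)
    return math.hypot(d_dx, d_dy)

def winding_number(radius, steps=360):
    """Total phase change around a circle of the given radius, in units of 2*pi."""
    total = 0.0
    prev = phase(radius, 0.0)
    for n in range(1, steps + 1):
        angle = 2.0 * math.pi * n / steps
        cur = phase(radius * math.cos(angle), radius * math.sin(angle))
        # Map each step into [-pi, pi) to undo the 2*pi jump at the atan2 branch cut.
        step = (cur - prev + math.pi) % (2.0 * math.pi) - math.pi
        total += step
        prev = cur
    return total / (2.0 * math.pi)

for r in (1.0, 0.1, 0.01):
    print(r, phase_gradient(r), winding_number(r))
```

The winding stays fixed however small the loop is made, while the gradient grows without bound as $r \rightarrow 0$, which is why the core must be singular.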
Finally, it has been suggested that Schwarzschild black holes might in fact be strings, whose physics is then amenable to a description using string theory. There are some hints that the properties of such a black hole coincide with those of a string, although we do not yet have conclusive evidence for this.

49.5 Superspace

String theory represents only one of many possible approaches that attempt to reconcile general relativity and quantum mechanics. A very different strategy is to accept that the uncertainty inherent in the quantum-mechanical description of particle mechanics means that spacetime, with its rigid (3+1)-dimensional structure, can itself only be a classical approximation. In fact, it approximates a more subtle and complex state of affairs that allows quantum states to be realized with certain probability amplitudes $\psi$. This is quite a radical approach, in that it implies that the successful unification of quantum mechanics and gravitation involves abandoning the picture of a structured spacetime describing events as the basic arena of gravitation.
In order to build a more suitable foundation, we start with classical spacetime and note that we can deconstruct it by taking three-dimensional spacelike slices, or 3-surfaces ${}^{(3)} \mathcal{C}$. There is some freedom in how we do this, but this is subject to the constraint that we can rebuild spacetime by stacking up the slices in a well-defined manner. Next, in order to accommodate the probabilistic features of quantum mechanics, our collection of spacelike surfaces must be vastly increased to form a superspace. This superspace will include all possible 3-surfaces, each of which, in a quantum theory, occurs with a particular probability amplitude $\psi\left({ }^{(3)} \mathcal{C}\right)$.
Example 49.10
The quantum properties of a system are described in terms of a quantum amplitude, which is determined through the combination of the phases of interfering wavelike contributions. In Feynman's 'sum over histories' description of quantum mechanics,${}^{17}$ the phase of each contribution is determined by the classical action $S\left({ }^{(3)} \mathcal{C}\right)$ of the corresponding configuration of 3-space ${}^{(3)} \mathcal{C}$, such that each contribution to the wavefunction takes the form
\begin{equation*} \psi\left({ }^{(3)} \mathcal{C}\right) \sim \exp \left[\frac{\mathrm{i} S\left({ }^{(3)} \mathcal{C}\right)}{\hbar}\right] \tag{49.39} \end{equation*}
To obtain the quantum probability amplitudes, we must then sum all of the possible $\psi\left({ }^{(3)} \mathcal{C}\right)$ that are compatible with the problem we're considering.
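A toy version of this sum shows why matching actions matter. The sketch below is not a quantum-gravity calculation: it simply adds phase factors $\mathrm{e}^{\mathrm{i}S/\hbar}$ for two invented lists of actions (with $\hbar=1$ in arbitrary units), one in which every contribution has the same action and one in which the phases are spread evenly over a full turn. The function name `amplitude` is hypothetical.

```python
import cmath
import math

def amplitude(actions, hbar=1.0):
    """Sum the phase factors exp(i S / hbar) over a list of actions."""
    return sum(cmath.exp(1j * s / hbar) for s in actions)

n = 100
aligned = [5.0] * n                                  # identical actions
spread = [2.0 * math.pi * k / n for k in range(n)]   # phases spread evenly over a full turn

print(abs(amplitude(aligned)))   # close to n: the contributions reinforce
print(abs(amplitude(spread)))    # close to 0: the contributions cancel
```

Contributions whose actions agree add up to a large amplitude, whereas contributions with widely differing actions largely cancel.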
We obtain constructive interference when the actions for two configurations match up. This implies that the dynamics of quantum gravity can be determined by computing the details of how wavefronts of constant action $S$ propagate throughout the superspace. The equation governing this (entirely classical) propagation is known as the Einstein-Hamilton-Jacobi equation${}^{18}$ and is given by
\begin{equation*} \frac{1}{\sqrt{\gamma}}\left(\frac{1}{2} \gamma_{i j} \gamma_{k l}-\gamma_{i k} \gamma_{j l}\right) \frac{\delta S}{\delta \gamma_{i j}} \frac{\delta S}{\delta \gamma_{k l}}+\sqrt{\gamma}\,{}^{(3)} R=0 \tag{49.40} \end{equation*}
where $\gamma_{i j}$ are components of the three-dimensional metric describing a hypersurface and $\gamma$ is its determinant. Here, ${}^{(3)} R$ is the Ricci scalar for the 3-geometry. Remarkably, this one equation contains the same information as the Einstein field equation.
The fluctuations that characterize the quantum world (e.g. the zero-point fluctuations in a quantum oscillator) are expected to occur in the metric field on the scale of the Planck length. In the superspace approach, quantum fluctuations on this scale cause the probability amplitudes $\psi\left({ }^{(3)} \mathcal{C}_{i}\right)$ for a range of 3-surfaces ${}^{(3)} \mathcal{C}_{i}$ to take on appreciable values. This leads to a fundamental limit to how well the spacetime picture adopted elsewhere in general relativity approximates the fluctuating reality of the underlying quantum world.
Ultimately, the superspace approach, with its abstract geometry containing all of the 3-surfaces, and classical equation of motion for describing the phases corresponding to each, does not make any strong claims about the underlying structure of the interactions that allow quantum mechanics and gravitation to coexist and interact. This is, to its supporters, a positive aspect, since it represents a conservative and robust approach that relies on the metric field, instead of novel and unobserved features in Nature, such as those that are found in string theory.

49.6 Loop quantum gravity

An alternative approach to string theory that aims to combine quantum mechanics and gravitation is loop quantum gravity (LQG). This theory involves an attempt to quantize the geometry of spacetime itself. After all, lots of things in quantum mechanics become quantized, such as harmonic oscillator energy levels or angular momentum states, so why not spacetime geometry itself? The intuition is the following: imagine localizing a particle with precision $L$. Heisenberg uncertainty
${}^{17}$ In brief, Feynman's approach to quantum mechanics involves considering every possible path a particle can take in getting between two points. We compute the classical action $S_{i}$ for each trajectory and then build the quantum amplitude for the particle to travel between the two points by summing a factor $\mathrm{e}^{\mathrm{i} S_{i} / \hbar}$ for every possible trajectory, to give an amplitude $\mathcal{A}=\sum_{i} \mathrm{e}^{\mathrm{i} S_{i} / \hbar}$. In this way, the amplitude for a quantum mechanical process is built by a sum over all possible trajectories. This picture is described in more detail in our Quantum Field Theory for the Gifted Amateur (2014).
${}^{18}$ The Hamilton-Jacobi equation in classical particle mechanics reads $H=-\frac{\partial S}{\partial t}$. It describes the evolution of the function $S$ (equal to the classical action) resulting from a Hamiltonian function $H$. This is the only formulation of classical mechanics that represents particle motion in terms of the properties of a wave with a phase determined by $S$. It is no coincidence, then, that Schrödinger's equation closely resembles the Hamilton-Jacobi equation. The version of the Hamilton-Jacobi equation in eqn 49.40, suitable for computations in general relativity, involves the evolution of the function $S$ with respect to the 3-metric components, driven by a term $\sqrt{\gamma}\,{}^{(3)} R$. More details of the Hamilton-Jacobi equation in classical particle mechanics can be found in Landau and Lifshitz (volume I, 1976).
${}^{19}$ This phrase is from Rovelli and Vidotto (2015), a highly readable introduction to LQG.
${}^{20}$ The total angular momentum (squared) is related to the operator $\hat{\vec{L}}^{2}=\left(\hat{L}^{1}\right)^{2}+\left(\hat{L}^{2}\right)^{2}+\left(\hat{L}^{3}\right)^{2}$.
Fig. 49.10 (a) The quantization of a tetrahedral region of spacetime. (b) Overlapping volumes can be reduced to a graph in which closed paths over the nodes are the loops of the theory.
would demand that $L \Delta p>\hbar$, and since $(\Delta p)^{2}=\left\langle p^{2}\right\rangle-\langle p\rangle^{2}$, this means $\left\langle p^{2}\right\rangle>(\hbar / L)^{2}$. As we localize the particle in a smaller and smaller region, its momentum will go up and so will its energy $E$, and it will become ultra-relativistic so that $E \approx p c$ and hence $E$ will exceed $\hbar c / L$. Energy $E$ acts as gravitational mass via $E=M c^{2}$ and a concentrated mass in this small region $L$ will become a black hole with Schwarzschild radius $R=G M / c^{2}$. This horizon will reach $L$ when $L=\left(G / c^{2}\right)(\hbar c / L) / c^{2}$, i.e. $L=\ell_{\mathrm{P}}$, so that the particle is localized within a Planck length. Thus, we conclude that though spacetime might be smooth at length scales above the Planck length, things below the Planck length are 'hidden inside its own mini-black hole'.${}^{19}$
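The length at which the horizon catches up with the localization scale can be evaluated directly. Setting $L=\left(G / c^{2}\right)(\hbar c / L) / c^{2}$ gives $L^{2}=G \hbar / c^{3}$. The sketch below uses the order-of-magnitude horizon radius $R=G M / c^{2}$ adopted in this argument and standard SI values of the constants; the function names are hypothetical.

```python
import math

G = 6.67430e-11          # gravitational constant, m^3 kg^-1 s^-2
HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 299792458.0          # speed of light, m s^-1

def planck_length():
    """Length L satisfying L^2 = G * hbar / c^3."""
    return math.sqrt(G * HBAR / C ** 3)

def horizon_radius(L):
    """R = G M / c^2 for the mass M = (hbar c / L) / c^2 carried by the localized particle."""
    mass = (HBAR * C / L) / C ** 2
    return G * mass / C ** 2

lp = planck_length()
print(lp)                       # of order 1e-35 m
print(horizon_radius(lp) / lp)  # ratio is 1 at the Planck length
```

For $L$ larger than this value the horizon radius is smaller than $L$; for $L$ smaller, the horizon would enclose the region being probed.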
So how do we go about quantizing spacetime? LQG uses the quantum-mechanical intuition that it is the commutation relations between operators that give rise to quantization conditions. For example, the components of angular momentum $\hat{L}^{i}$ $(i=1,2,3)$ obey the commutation relation
\begin{equation*} \left[\hat{L}^{i}, \hat{L}^{j}\right]=\mathrm{i} \hbar \varepsilon^{i j k} \hat{L}^{k} \tag{49.41} \end{equation*}
and this leads to the quantization of angular momentum. Moreover, they lead to the interesting feature that you can know the (square of the) total angular momentum${}^{20}$ and any one component of the angular momentum (such as $\hat{L}^{3}$), but not the other components of the angular momentum. In LQG, we try and do the same thing with an element of space, and we start with a very simple three-dimensional shape: the tetrahedron shown in Fig. 49.10(a). One way of describing this tetrahedron is by using four vectors [see Fig. 49.10(a)] which we will call $\vec{L}_{a}$ where $a=1,2,3,4$; the directions of these vectors are perpendicular to the four faces of the tetrahedron and the magnitudes are equal to the area of the four faces of the tetrahedron. These vectors satisfy the condition
\begin{equation*} \sum_{a=1}^{4} \vec{L}_{a}=0 \tag{49.42} \end{equation*}
and one can show that the volume $V$ of the tetrahedron is given by $V^{2}=\frac{2}{9} \vec{L}_{1} \times \vec{L}_{2} \cdot \vec{L}_{3}$.
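Both statements are easy to check for a concrete classical tetrahedron. The sketch below builds the four outward area vectors from the vertices, confirms that they sum to zero, and compares $\frac{2}{9}\left|\vec{L}_{1} \times \vec{L}_{2} \cdot \vec{L}_{3}\right|$ with the square of the volume computed independently from the edge vectors. The absolute value is taken because the sign of the triple product depends on how the faces are labelled; the helper names and the example vertices are hypothetical.

```python
def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def area_vectors(p0, p1, p2, p3):
    """Outward normal vectors of the four faces, each with magnitude equal to the face area."""
    verts = [p0, p1, p2, p3]
    centroid = tuple(sum(c) / 4.0 for c in zip(*verts))
    result = []
    for skip in range(4):
        a, b, c = [verts[i] for i in range(4) if i != skip]
        n = tuple(0.5 * x for x in cross(sub(b, a), sub(c, a)))
        # Flip the normal if it points towards the centroid rather than away from it.
        if dot(n, sub(a, centroid)) < 0:
            n = tuple(-x for x in n)
        result.append(n)
    return result

def volume(p0, p1, p2, p3):
    """Volume from the scalar triple product of three edge vectors."""
    return abs(dot(sub(p1, p0), cross(sub(p2, p0), sub(p3, p0)))) / 6.0

def volume_squared_from_areas(L):
    """(2/9) |L1 x L2 . L3| using the first three area vectors."""
    return (2.0 / 9.0) * abs(dot(cross(L[0], L[1]), L[2]))

tet = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
L = area_vectors(*tet)
print([sum(v[i] for v in L) for i in range(3)])           # closure: each component sums to zero
print(volume(*tet) ** 2, volume_squared_from_areas(L))    # the two values agree
```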
Geometry itself can then be quantized by upgrading these vectors into operators and imposing commutation relations between their components. One possible quantization scheme for our tetrahedron would then be to write
\begin{equation*} \left[\hat{L}_{a}^{i}, \hat{L}_{b}^{j}\right]=\mathrm{i} \delta_{a b} \ell_{0}^{2} \varepsilon^{i j k} \hat{L}_{a}^{k} \tag{49.43} \end{equation*}
Here $\ell_{0}$ is a constant, which must have dimensions of length (since $\hat{L}_{a}^{i}$ is an operator whose eigenvalue has the dimensions of area), and it turns out that it should be a constant multiplied by the Planck length. Equation 49.43 is simply a postulate, but what would it imply if we accepted it? The first thing to note is that the area of the faces of this tetrahedron would behave analogously to angular momentum in ordinary quantum mechanics. Thus, the area of the $a$th face of a tetrahedron must be quantized with eigenvalues
\begin{equation*} A_{a}=\ell_{0}^{2} \sqrt{j_{a}\left(j_{a}+1\right)} \tag{49.44} \end{equation*}
with $j_{a}=0,1 / 2,1,3 / 2, \ldots$. The second thing one can conclude is that even if you know one Cartesian component of the area of one face of the tetrahedron, you won't know any of the other Cartesian components of the area of the other faces (for exactly the same reason that you can only know one component of the angular momentum). The normal vectors to each face of the tetrahedron are therefore known only partially and so these faces somehow blurrily shimmer with quantum uncertainty! We therefore deduce that geometry becomes fuzzy when you get down to the Planck scale; if your ambitions stretch to determining every aspect of the geometry of a shape at the smallest length scales, then you will be limited by fundamental quantum uncertainty. Thus, even though our argument has been formulated in terms of a tetrahedron, it would work if we had chosen some other shape and so we can't deduce that the smallest scale really does consist of a network of tetrahedra since we can't have precise information about microscopic quantum geometry. This then is the consequence of LQG: the Riemannian geometry that gives rise to gravitation must be replaced with quantum geometry that involves an inherent uncertainty at the Planck scale in at least some lengths, angles, and areas. In order to describe the curved spacetime of gravitation, we must consider a mesh of spacetime volumes such as the tetrahedron discussed above. These meshes can be reduced to graphs whose lines are analogous to lines of force. The loops in LQG are the closed paths linking nodes in these graphs [Fig. 49.10(b)].
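The discreteness of the face areas in eqn 49.44 can be tabulated in a few lines. The sketch below lists $A_{a}$ in units of $\ell_{0}^{2}$ (so `l0 = 1.0` is an arbitrary choice of units) for the first few allowed values of $j_{a}$; the function name is hypothetical.

```python
import math

def area_eigenvalue(j, l0=1.0):
    """Area eigenvalue l0^2 * sqrt(j (j + 1)) for a face with quantum number j."""
    if j < 0 or (2 * j) != int(2 * j):
        raise ValueError("j must be a non-negative integer or half-integer")
    return l0 ** 2 * math.sqrt(j * (j + 1))

for two_j in range(0, 5):
    j = two_j / 2
    print(j, area_eigenvalue(j))
```

The smallest non-zero area, obtained for $j_{a}=1/2$, is $\frac{\sqrt{3}}{2} \ell_{0}^{2}$, and the spacing between successive values is not uniform.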
Let's return to the tetrahedron, and imagine we know the eigenvalue $j_{a}$ for each vector operator $\hat{\vec{L}}_{a}$, remembering that we also know that these vectors are subject to a closure property (eqn 49.42). The volume operator $\hat{V}$, defined via $\hat{V}^{2}=\frac{2}{9} \hat{\vec{L}}_{1} \times \hat{\vec{L}}_{2} \cdot \hat{\vec{L}}_{3}$, commutes with the closure operator $\hat{C}$ (defined by $\hat{C}=\sum_{a=1}^{4} \hat{\vec{L}}_{a}$) and so it turns out that we can write states as $\left|j_{1}, j_{2}, j_{3}, j_{4}, v\right\rangle$, a common eigenstate of the four total area operators and the volume operator (with eigenvalue $v$). Thus, there is a fundamental 'quantum of space', so that the tetrahedron can grow or shrink only in discrete steps.
LQG remains a theory under construction and so it is not yet clear how well it describes our observations. Perhaps most seriously, we currently do not have a semiclassical limit of the theory that recovers general relativity. In addition to not reproducing the physical predictions of general relativity, LQG has not yet given rise to any prediction not made by the Standard Model. As a result, the jury is still out on this theory, as it is on all of the quantum approaches to gravitation.${}^{21}$
${}^{21}$ Rovelli and Vidotto (2015) give much more detail and discussion of LQG in a highly engaging form. Readers interested in further alternative (and technical) approaches to quantum gravity can consult the vibrant literature on the subject. For example, for an introduction to twistor theory see the books by Penrose (2004) and by Zee (2013); for an introduction to Regge calculus see Misner, Thorne, and Wheeler (1973); for an introduction to spinors in relativity see Wald (1984) and also Misner, Thorne, and Wheeler (1973).

49.7 Anti-de Sitter spacetime

In studying quantum mechanics, we often use the idea of a particle confined to a box.${}^{22}$ Clearly, confining the gravitational field to a box is
${}^{22}$ Our discussion in this section follows that of Zee, which can be consulted for further details.
${}^{23}$ We shall write $d$-dimensional AdS spacetime as $\mathrm{AdS}^{d}$. The holographic principle was originally proposed by Gerard 't Hooft (1946- ). It says that the description of a $d$-dimensional volume of space can be encoded on its $(d-1)$-dimensional boundary (similarly to how a three-dimensional image is captured in a two-dimensional interference pattern in optical holography). $\mathrm{AdS}^{d}$ spacetime represents a particularly vivid example of the holographic principle.
${}^{24}$ CFT = conformal field theory, which is to say, a conformally invariant gauge field theory. The AdS/CFT correspondence was originally proposed for string theory in AdS space by Juan Maldacena (1968- ). The idea is that some theories of quantum gravity are equivalent to quantum theories with no gravitational interaction in fewer dimensions. It has been suggested that the correspondence might also provide insight into research areas such as condensed matter physics in the future. See Năstase (2017) for further details.
Fig. 49.11 de Sitter spacetime as a hyperboloid embedded in $(4+1)$-dimensional Minkowski spacetime.
${}^{25}$ In Chapter 15, we described a spacetime of constant curvature as having a Riemann tensor determined by the Ricci scalar. Equivalently, we described it in Chapter 16 as having a Riemann tensor with components
$$R_{\mu \nu \alpha \beta}=K\left(g_{\mu \alpha} g_{\nu \beta}-g_{\mu \beta} g_{\nu \alpha}\right).$$
In this case, the constant $K=\alpha^{-2}$.
not something we can straightforwardly do in our own spacetime. It is possible to confine gravity in Anti-de Sitter (AdS) spacetime, whose geometry is related to the de Sitter spacetime we met in Chapter 15. A remarkable feature of $d$-dimensional AdS spacetime is that it possesses a spatial boundary made up of Minkowski spacetime with one fewer dimension.${}^{23}$ AdS spacetime has caught the imaginations of many relativists, especially after the discovery that the physics of some gravitational theories in $(4+1)$-dimensional AdS spacetime ($\mathrm{AdS}^{5}$) can be mapped onto (3+1)-dimensional Minkowski space. This is known as the AdS/CFT correspondence${}^{24}$ and is an active area of research into quantum theories of gravity. AdS spacetime is not a quantum theory of gravity in itself, but might be an important ingredient, and we discuss it in this section.
Before we describe AdS space, let's revisit the view of de Sitter spacetime discussed in the exercises for Chapter 18. We saw there that model universes driven by a non-zero cosmological constant $\Lambda$ can be represented geometrically using this spacetime. de Sitter spacetime can be visualized as a hyperboloid defined by
\begin{equation*} -T^{2}+X^{2}+Y^{2}+Z^{2}+W^{2}=\alpha^{2} \tag{49.45} \end{equation*}
embedded in a flat five-dimensional space with metric
\begin{equation*} \mathrm{d} s^{2}=-\mathrm{d} T^{2}+\mathrm{d} X^{2}+\mathrm{d} Y^{2}+\mathrm{d} Z^{2}+\mathrm{d} W^{2} \tag{49.46} \end{equation*}
The embedding is shown in Fig. 49.11 with two dimensions suppressed. Going through the embedding routine from Appendix D, the result of eliminating a dimension is a line element with the form
\begin{equation*} \mathrm{d} s^{2}=\left(\eta_{\mu \nu}-\frac{\eta_{\mu \alpha} \eta_{\nu \beta} X^{\alpha} X^{\beta}}{\eta_{\rho \sigma} X^{\rho} X^{\sigma}-\alpha^{2}}\right) \mathrm{d} X^{\mu} \mathrm{d} X^{\nu} \tag{49.47} \end{equation*}
where indices in the last equation run from 0 to 3. The topology of this spacetime is built from three-dimensional spheres $S^{3}$ that start at $T \rightarrow-\infty$ with infinite radius, shrink down to a minimum radius $\alpha$, before starting to expand again for $T \rightarrow \infty$. We call this topology $\mathbb{R}^{1} \times S^{3}$ (i.e. a real line representing the time combined with 3-spheres at every instant). This is a spacetime of constant curvature${}^{25}$ and is consistent with a cosmological constant $\Lambda=R / 4$ and an Einstein tensor with components $G_{\mu \nu}=-\frac{1}{4} R g_{\mu \nu}$, where $R>0$.
We found in Chapter 18 that it is possible to represent models with different spatial curvatures by covering the hyperboloid with coordinates that make different cuts through the spacetime. This versatility of de Sitter spacetime follows from the high degree of symmetry that it possesses. In fact, another way of expressing its constant curvature is to say that de Sitter spacetime is an example of a maximally symmetric space. Here 'maximal symmetry' means that the space has the same number of symmetries as Euclidean space. A sphere also has this property and it is evident that the de Sitter spacetime can be thought of as a version of a higher dimensional sphere in Minkowski space, with the
transformation $T^{2} \rightarrow-T^{2}$ providing a means of swapping between a sphere in Euclidean space and de Sitter geometry in Minkowski spacetime.
Anti-de Sitter spacetime can be thought of as de Sitter spacetime with $R<0$, corresponding to a negative cosmological constant $-\Lambda$.${}^{26}$ It has topology $S^{1} \times \mathbb{R}^{3}$ and can be represented as a hyperboloid
\begin{equation*} -T^{2}+X^{2}+Y^{2}+Z^{2}-U^{2}=-\alpha^{2} \tag{49.50} \end{equation*}
This is to be embedded in a spacetime with line element
\begin{equation*} \mathrm{d} s^{2}=-\mathrm{d} T^{2}+\mathrm{d} X^{2}+\mathrm{d} Y^{2}+\mathrm{d} Z^{2}-\mathrm{d} U^{2} \tag{49.51} \end{equation*}
which is a $(3+2)$-dimensional Minkowski space with two timelike variables (i.e. variables that enter the metric with a minus sign). The embedding is shown in Fig. 49.12, with two dimensions suppressed. The spacetime resembles the de Sitter hyperboloid turned on its side. The embedding routine now results in a line element
\begin{equation*} \mathrm{d} s^{2}=\left(\eta_{\mu \nu}-\frac{\eta_{\mu \alpha} \eta_{\nu \beta} X^{\alpha} X^{\beta}}{\eta_{\rho \sigma} X^{\rho} X^{\sigma}+\alpha^{2}}\right) \mathrm{d} X^{\mu} \mathrm{d} X^{\nu} \tag{49.52} \end{equation*}
with indices running from 0 to 3 again. In general, we can swap back and forth between results in de Sitter space and anti-de Sitter space by swapping $\alpha^{2} \rightarrow-\alpha^{2}$.
The unusual shape of AdS spacetime allows for the existence of closed timelike loops. This is undesirable owing to the violation of causality that it allows. However, the tube-like shape of AdS spacetime can effectively be cut and unrolled with a good choice of coordinates. (We say the topology has been changed to $\mathbb{R}^{4}$ as a result.) The standard choice of coordinates that does this for three-dimensional AdS spacetime ($\mathrm{AdS}^{3}$) is
\begin{equation*} T=\left(1+r^{2}\right)^{\frac{1}{2}} \cos t, \quad U=\left(1+r^{2}\right)^{\frac{1}{2}} \sin t, \quad X=r \cos \theta, \quad Y=r \sin \theta \tag{49.53} \end{equation*}
In terms of these coordinates, the line element becomes
\begin{equation*} \mathrm{d} s^{2}=-\left(1+r^{2}\right) \mathrm{d} t^{2}+\frac{\mathrm{d} r^{2}}{1+r^{2}}+r^{2} \mathrm{~d} \theta^{2} \tag{49.54} \end{equation*}
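The step from the embedding coordinates to this line element can be checked numerically. The sketch below takes the $\mathrm{AdS}^{3}$ version of the embedding (the coordinate $Z$ dropped and $\alpha=1$, as the coordinates in eqn 49.53 imply), verifies that the point lies on the hyperboloid $-T^{2}-U^{2}+X^{2}+Y^{2}=-1$, and pulls the flat metric back to the $(t, r, \theta)$ coordinates using central finite differences. The function names and the sample point are hypothetical.

```python
import math

SIGNATURE = (-1.0, -1.0, 1.0, 1.0)   # signs of dT^2, dU^2, dX^2, dY^2

def embed(t, r, theta):
    """Embedding coordinates (T, U, X, Y) of eqn 49.53, with alpha = 1."""
    s = math.sqrt(1.0 + r * r)
    return (s * math.cos(t), s * math.sin(t), r * math.cos(theta), r * math.sin(theta))

def constraint(t, r, theta):
    """Value of -T^2 - U^2 + X^2 + Y^2, which should equal -1."""
    return sum(s * c * c for s, c in zip(SIGNATURE, embed(t, r, theta)))

def induced_metric(t, r, theta, h=1e-5):
    """3x3 induced metric in (t, r, theta), from finite-difference derivatives of embed."""
    point = (t, r, theta)
    columns = []
    for i in range(3):
        up = list(point)
        down = list(point)
        up[i] += h
        down[i] -= h
        columns.append([(a - b) / (2.0 * h) for a, b in zip(embed(*up), embed(*down))])
    return [[sum(s * columns[i][k] * columns[j][k] for k, s in enumerate(SIGNATURE))
             for j in range(3)] for i in range(3)]

t0, r0, th0 = 0.3, 1.2, 0.7
print(constraint(t0, r0, th0))
for row in induced_metric(t0, r0, th0):
    print(row)
```

Up to finite-difference error, the result is diagonal with entries $-(1+r^{2})$, $1/(1+r^{2})$ and $r^{2}$, in agreement with eqn 49.54.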

Example 49.11

We can produce a conformal version of the AdS line element using the ideas from Chapter 19. First, make the choice $r=\tan \psi$, then rewrite eqn 49.54 to say
\begin{equation*} \mathrm{d} s^{2}=\frac{1}{\cos ^{2} \psi}\left(-\mathrm{d} t^{2}+\mathrm{d} \psi^{2}+\sin ^{2} \psi \mathrm{~d} \theta^{2}\right) \tag{49.55} \end{equation*}
Here ψ ψ psi\psiψ is a latitude-like variable. As the radius-like coordinate r r rrr goes from 0 to oo\infty, the latitude ψ ψ psi\psiψ goes from 0 to π / 2 π / 2 pi//2\pi / 2π/2 (rather than π π pi\piπ, as we might have expected). More colourfully, latitude in AdS spacetime goes from north pole to the equator, not to the south pole. With this observation, we have discovered the boundary AdS spacetime!
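The step from eqn 49.54 to eqn 49.55 can be spelled out term by term. The following is a sketch of the intermediate algebra, using only the substitution $r=\tan\psi$ stated in the example; no new symbols are introduced.

```latex
\begin{align*}
r=\tan\psi \quad&\Rightarrow\quad 1+r^{2}=\frac{1}{\cos^{2}\psi},
\qquad \mathrm{d}r=\frac{\mathrm{d}\psi}{\cos^{2}\psi}, \\
-\left(1+r^{2}\right)\mathrm{d}t^{2} &= -\frac{\mathrm{d}t^{2}}{\cos^{2}\psi}, \\
\frac{\mathrm{d}r^{2}}{1+r^{2}} &= \cos^{2}\psi\,\frac{\mathrm{d}\psi^{2}}{\cos^{4}\psi}
 = \frac{\mathrm{d}\psi^{2}}{\cos^{2}\psi}, \\
r^{2}\,\mathrm{d}\theta^{2} &= \frac{\sin^{2}\psi}{\cos^{2}\psi}\,\mathrm{d}\theta^{2}.
\end{align*}
```

Each term carries the common factor $1/\cos^{2}\psi$, which is the conformal factor in eqn 49.55. It diverges as $\psi \rightarrow \pi/2$, which is where the boundary sits.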
$^{26}$ It has a Riemann tensor with components
$R_{\mu \nu \alpha \beta}=-\frac{1}{\alpha^{2}}\left(g_{\mu \alpha} g_{\nu \beta}-g_{\mu \beta} g_{\nu \alpha}\right)$.
Fig. 49.12 Anti-de Sitter spacetime as a hyperboloid embedded in (3+2)-dimensional Minkowski spacetime.
$^{27}$ See exercises for a derivation. Notice the resemblance to what we previously called the Poincaré line element: $\mathrm{d} s^{2}=\left(\mathrm{d} r^{2}+\mathrm{d} x^{2}\right) / r^{2}$.

Fig. 49.13 (a) Anti-de Sitter spacetime with its boundary. (b) A massive particle bounces owing to the boundary in spacetime
$^{28}$ This is of the form (kinetic energy) + (potential energy) = const.
$^{29}$ The latter is how it has been regarded for most of its lifetime. AdS spacetime was originally discussed in the 1920s by de Sitter (unhelpfully, both de Sitter and AdS were referred to as 'de Sitter spacetimes' for this reason). The spacetime was discovered independently by Tullio Levi-Civita.
$^{30}$ 'You may have enjoyed this course and decided that you would like to do your thesis research in general relativity. DON'T. Einstein spent the last thirty years of his life working on general relativity, and it led to nothing. And he was smarter than you.' So said Sidney Coleman (1937-2007), albeit in 1970.
To examine the consequence of the boundary seen in the last example, it is helpful to (once again!) recast AdS spacetime, this time in Poincaré coordinates$^{27}$ as
\begin{equation*} \mathrm{d} s^{2}=\frac{1}{w^{2}}\left(-\mathrm{d} t^{2}+\mathrm{d} x^{2}+\mathrm{d} w^{2}\right) \tag{49.56} \end{equation*}
The boundary now occurs at $w=0$. We can see from this coordinate system how a slice made at a particular value of $w$ is simply Minkowski space with one fewer spatial dimension. This idea is shown in Fig. 49.13(a), where AdS spacetime terminates on this flat boundary.
To see the physical influence of the boundary, we shall shoot photons and massive particles at it and see what happens.

Example 49.12

A light beam sent from a point $w_{0}$ to the boundary will, if reflected by a mirror at $w=0$, come back after a coordinate time $2 w_{0}$. Things are different for a massive particle. A massive particle obeys the usual condition on its velocity $\boldsymbol{u} \cdot \boldsymbol{u}=-1$, which gives us, for motion at fixed $x$, the equation of motion
\begin{equation*} \dot{t}^{2}-\dot{w}^{2}=w^{2} \tag{49.57} \end{equation*}
where dots indicate a derivative with respect to proper time. Owing to the absence of the variable $t$ in the metric components we have a Killing vector $\partial / \partial t$ and a conservation law $u_{t}=g_{t t} u^{t}=$ const., or
\begin{equation*} -\frac{1}{w^{2}} \frac{\mathrm{~d} t}{\mathrm{~d} \tau}=\text { (const.). } \tag{49.58} \end{equation*}
Let's write this latter equation $\dot{t}=w^{2} / b$, with $b$ a constant length determined by the initial conditions. Substituting the conservation law back into eqn 49.57 we obtain an equation of motion in terms of the coordinate time$^{28}$ of
\begin{equation*} \left(\frac{\partial w}{\partial t}\right)^{2}+\frac{b^{2}}{w^{2}}=1 \tag{49.59} \end{equation*}
This describes the motion of a massive particle in a Newtonian potential with positive potential energy $V(w)=(b / w)^{2}$, which diverges as the boundary $w=0$ is approached. As shown in Fig. 49.13(b), a particle set in motion in this potential never reaches the boundary. It must stop and turn back at the point $w=b$.
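The bounce described by eqn 49.59 can be checked numerically. The sketch below assumes a trial trajectory $w(t)=\sqrt{b^{2}+(t-t_{\mathrm{turn}})^{2}}$, which is not given in the text but can be verified to satisfy eqn 49.59 by direct substitution. The function names, the value of `b` and the sample times are all illustrative choices.

```python
import math

def w_of_t(t, b, t_turn=0.0):
    """Trial trajectory w(t) = sqrt(b^2 + (t - t_turn)^2), in units with c = 1."""
    return math.sqrt(b * b + (t - t_turn) ** 2)

def energy_residual(t, b, h=1e-6):
    """Return (dw/dt)^2 + b^2/w^2 - 1, with dw/dt from a central difference."""
    dwdt = (w_of_t(t + h, b) - w_of_t(t - h, b)) / (2.0 * h)
    w = w_of_t(t, b)
    return dwdt ** 2 + (b / w) ** 2 - 1.0

b = 0.5  # illustrative turning-point value, not taken from the text
times = [-3.0, -1.0, 0.0, 1.0, 3.0]

# Residual of eqn 49.59 along the trajectory; it should vanish up to rounding.
residuals = [energy_residual(t, b) for t in times]

# The smallest w reached is the turning point, which should equal b.
closest_approach = min(w_of_t(t, b) for t in times)

for t, res in zip(times, residuals):
    print(f"t = {t:+.1f}  w = {w_of_t(t, b):.4f}  residual = {res:.2e}")
print("closest approach to the boundary:", closest_approach)
```

The particle comes in from large $w$, reaches $w=b$ and returns, so it never touches $w=0$. A light ray, by contrast, has $\mathrm{d}w/\mathrm{d}t=\pm 1$ and reaches the boundary in coordinate time $w_{0}$.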
The last example hints at how the energy of a particle might be something that can be mapped to a characteristic value of $w=b$. In this way, the boundary of AdS space is able to encode the physics of the bulk.
Ultimately nobody yet knows whether AdS is an essential part of quantum gravity, or an intriguing curiosity.$^{29}$ Of course, the same could currently be said of each of the theories we have discussed in this chapter. It is perhaps not too optimistic to hope that, ultimately, experiment will provide the final word.$^{30}$

49.8 Our current best guess

Which model best describes our Universe? In the last few decades, cosmology has gone from being a highly speculative field (in the 1950s people were arguing about whether the Universe had a beginning or not) to one which is now heavily constrained by very well tied down measurements. Its practitioners claim that we have now entered the era of precision cosmology. The Cosmic Background Explorer (COBE) satellite revealed in 1992 that the cosmic microwave background (CMB) exhibits a beautiful blackbody spectrum with a temperature of 2.725 K, and this spectrum is pretty smooth across the sky (there are fluctuations but their amplitude is $\delta T / T \sim 10^{-5}$). In 2001, the Wilkinson Microwave Anisotropy Probe (WMAP) was launched to measure the angular spectrum of these fluctuations in greater detail, and 2009 saw the launch of the Planck satellite, which improved on these measurements even further. These data fit well with a Big-Bang cosmological model known as $\Lambda\mathbf{CDM}$, where $\Lambda$ refers to the cosmological constant and CDM refers to cold dark matter.
Let's unpack these terms. First, the presence of $\Lambda$ in the model is consistent with the experimental observation that the expansion of the Universe is currently accelerating, which has been determined by measurements of type-Ia supernovae used as 'standardized candles'. An inflationary period in the Universe's history mandates that the Universe must be very close to its critical density, and the models show that $\Omega_{0}$ (the ratio of the Universe's density to the critical density) is $0.999(2)$. From a variety of measurements, it is found that the baryonic$^{31}$ matter in the Universe has a density of $\Omega_{\mathrm{B}}=0.05$. Now we come to CDM, non-baryonic matter which resides in the halos of galaxies.$^{32}$ The CDM density comes out to be $\Omega_{\mathrm{M}}=0.26$, much larger than the baryonic density, but the sum of the two does not yield $\Omega_{0} \approx 1$, so that $\Omega_{\Lambda}=0.69$ has to make up the difference. The cosmological term has been termed dark energy; it is believed to have a very low density, but is completely uniform across all space$^{33}$ and so dominates the overall mass/energy of the Universe. It might be thought to be the energy of the quantum vacuum, although in 1968 Zel'dovich pointed out that, while the energy of the quantum vacuum could contribute to $\Lambda$, it would result in an energy density fifty orders of magnitude larger than the critical density.
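The density budget quoted above is simple arithmetic, and it can be turned into physical densities with one extra input. The sketch below assumes the standard expression for the critical density, $\rho_{\mathrm{c}}=3 H_{0}^{2} / 8 \pi G$, and an assumed round value $H_{0}=68 \mathrm{~km} \mathrm{~s}^{-1} \mathrm{Mpc}^{-1}$; neither is stated in this section, and the variable names are illustrative.

```python
import math

# Density parameters quoted in the text (fractions of the critical density).
omega_baryon = 0.05
omega_cdm = 0.26
omega_lambda = 0.69

omega_matter = omega_baryon + omega_cdm
omega_total = omega_matter + omega_lambda

# Physical constants and unit conversions (rounded values).
G = 6.674e-11            # Newton's constant, m^3 kg^-1 s^-2
MPC_IN_M = 3.0857e22     # metres per megaparsec
PROTON_MASS = 1.6726e-27 # kg

# ASSUMPTION: H0 = 68 km/s/Mpc, a round value chosen for illustration.
H0_KM_S_MPC = 68.0
h0_si = H0_KM_S_MPC * 1.0e3 / MPC_IN_M  # Hubble constant in s^-1

# Critical density rho_c = 3 H0^2 / (8 pi G), in kg per cubic metre.
rho_crit = 3.0 * h0_si ** 2 / (8.0 * math.pi * G)
rho_lambda = omega_lambda * rho_crit

print(f"Omega_matter = {omega_matter:.2f}, Omega_total = {omega_total:.2f}")
print(f"critical density    = {rho_crit:.3e} kg/m^3")
print(f"dark energy density = {rho_lambda:.3e} kg/m^3")
print(f"critical density in proton masses per m^3 = {rho_crit / PROTON_MASS:.1f}")
```

With these inputs the critical density corresponds to only a few proton masses per cubic metre, which illustrates how dark energy can have a very low density and still dominate the total budget.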
Here is our best guess of how all this fits together: at around $10^{-32} \mathrm{~s}$ after the Big Bang there is a period of inflation$^{34}$ in which the Universe expands exponentially, leading to a scale-invariant spectrum of gravitational waves (not yet detected in experiments, but may well be found in the coming years). This is terminated only when the scalar field potential energy is converted into particles. The resulting quark soup condenses into hadrons at $t \approx 10^{-5} \mathrm{~s}$, with baryons and antibaryons in roughly equal number, and as abundant as thermal photons. However, at $t \sim 1 \mathrm{~s}$, as the Universe cools below the temperature set by the rest-mass energy of the lightest baryon, most baryons and antibaryons annihilate each other, leaving a small excess$^{35}$ of baryons over antibaryons in the few$^{36}$ that remain. From about 0.01 s to 20 minutes, we have a period known as Big-Bang nucleosynthesis (BBN) when the lightest elements form, mostly ${ }^{4} \mathrm{He}$, but also some deuterium (D), ${ }^{3} \mathrm{He}$, and ${ }^{7} \mathrm{Li}$. The Universe cools further and the baryons fall into the gravitational potential wells produced by CDM
$^{31}$ Reminder: Baryons are particles like protons and neutrons which are composites of quarks, but this is a shorthand for the 'ordinary' matter in the Universe.
$^{32}$ Hot dark matter models were ruled out fairly quickly. CDM is cold, meaning that the dark matter particles are moving at speeds $\ll c$, and so become trapped in the gravitational potential wells of galaxies. They interact with gravity, but not with the strong or electromagnetic force (so we can't see them); they may or may not couple via the weak interaction. Since no-one has directly detected a CDM particle, we don't really know. The only reason we believe they are in halos around galaxies is the effect they have on the rotations of stars around galaxies via measurements that are known as rotation curves.
$^{33}$ This is unlike ordinary matter which is strongly clumped in stars and galaxies, with lots of regions of space empty of ordinary matter.
$^{34}$ Inflation is discussed in detail in Chapter 41.
$^{35}$ This is called baryogenesis. It is thought that non-equilibrium interactions that violate baryon number conservation, charge (C) conservation, and CP conservation, might allow the Universe to evolve a small net baryon abundance.
$^{36}$ Today there are only a few baryons per billion photons.
$^{37}$ There are also photons in the Universe, i.e. radiation. However, recall from Chapter 17 that cold matter density $\propto 1 / a^{3}$, directly due to the volume expansion of the Universe, but radiation density $\propto 1 / a^{4}$, which has an extra dependence on the scale factor given by $1 / a$ due to cosmological redshift. Thus, the Universe becomes matter dominated, since the radiation density decreases more quickly.
$^{38}$ Thus, we have only a loose understanding of the physics occurring at energies corresponding to the era of baryogenesis, let alone the inflationary era.
particles. At around 380,000 years after the Big Bang, the Universe is much cooler, so neutral atoms start to form (the nuclei find electrons to orbit around them) and the Universe becomes transparent to photons; the CMB dates from this era, giving us a snapshot of the Universe at this time. The expansion of the Universe is dominated$^{37}$ by the ordinary matter and dark matter. Both types of matter are 'cold', meaning that they are non-relativistic and pressure-less fluids. The Universe is now reasonably well described by the Einstein-de Sitter model, which we called 'Universe 3' back in Chapter 18. From about 5 Gyr ago, the expansion of the Universe starts to become dominated by the cosmological constant term; in other words, dark energy takes over and the era of cosmic acceleration begins. The Universe is now believed to be $13.80(2)$ Gyr old and its expansion is accelerating.
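The hand-over from matter to dark energy can be estimated with a few lines of arithmetic. The sketch below assumes a spatially flat universe containing only cold matter (density $\propto 1/a^{3}$) and a cosmological constant, neglects radiation, and uses the closed-form age $t(a)=\frac{2}{3 H_{0} \sqrt{\Omega_{\Lambda}}} \sinh ^{-1}\left(\sqrt{\Omega_{\Lambda} / \Omega_{\mathrm{m}}}\, a^{3 / 2}\right)$ that follows from the Friedmann equation under those assumptions. The value $H_{0}=68 \mathrm{~km} \mathrm{~s}^{-1} \mathrm{Mpc}^{-1}$ and all names are illustrative choices, not taken from the text.

```python
import math

# Density parameters from the text: matter = baryons + CDM.
OMEGA_MATTER = 0.05 + 0.26
OMEGA_LAMBDA = 0.69

# ASSUMPTION: H0 = 68 km/s/Mpc, chosen for illustration.
H0_KM_S_MPC = 68.0
MPC_IN_KM = 3.0857e19
SECONDS_PER_GYR = 3.15576e16
hubble_time_gyr = MPC_IN_KM / H0_KM_S_MPC / SECONDS_PER_GYR

def cosmic_time_gyr(a):
    """Age of a flat matter + Lambda universe at scale factor a (a = 1 today)."""
    prefactor = 2.0 / (3.0 * math.sqrt(OMEGA_LAMBDA))
    argument = math.sqrt(OMEGA_LAMBDA / OMEGA_MATTER) * a ** 1.5
    return prefactor * hubble_time_gyr * math.asinh(argument)

# Densities are equal when Omega_m / a^3 = Omega_Lambda.
a_equal = (OMEGA_MATTER / OMEGA_LAMBDA) ** (1.0 / 3.0)

# Acceleration starts when rho_matter = 2 rho_Lambda (the point where a-double-dot = 0).
a_accel = (OMEGA_MATTER / (2.0 * OMEGA_LAMBDA)) ** (1.0 / 3.0)

age_now = cosmic_time_gyr(1.0)
lookback_equal = age_now - cosmic_time_gyr(a_equal)
lookback_accel = age_now - cosmic_time_gyr(a_accel)

print(f"age today              ~ {age_now:.2f} Gyr")
print(f"acceleration begins at a = {a_accel:.3f}, ~ {lookback_accel:.1f} Gyr ago")
print(f"densities equal at     a = {a_equal:.3f}, ~ {lookback_equal:.1f} Gyr ago")
```

Under these assumptions the two transition epochs fall a few Gyr in the past, on either side of the 'about 5 Gyr' quoted above, and the computed age is close to the quoted $13.80(2)$ Gyr. The exact figures shift with the assumed $H_{0}$.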
Can we believe this $\Lambda$CDM picture? One impressive feature is that the numerical values of the various constants (e.g. the $\Omega$-values mentioned above) are pretty consistent between completely independent measurements based on (i) the gravity-driven acoustic oscillations in the CMB (which come from the surface of last scattering, determined at a time 380,000 years after the Big Bang) and on (ii) deuterium abundance (due to nuclear reactions that start to take place about one second after the Big Bang). The very early Universe involves some very high energies and temperatures$^{38}$ but the microphysics of the eras of (i) and (ii) above have been tested in the laboratory, so we can have some confidence that this picture might be right.
$\Lambda$CDM is therefore the current 'standard model' of modern cosmology and has survived various tests unscathed. However, it is worth saying that there are still some wrinkles. The abundance of ${ }^{7} \mathrm{Li}$ is not quite right, baryogenesis is not completely understood, and different measurements are giving slightly different values of the Hubble constant (they vary between about 67 and $74 \mathrm{~km} \mathrm{~s}^{-1} \mathrm{Mpc}^{-1}$), a problem which is called the Hubble tension. These problems might$^{39}$ go away following more detailed measurements, or perhaps following better understanding of some of the systematic errors. More significant is the fact that the physics of inflation is not well tied down, and our current model of inflation is, at best, an approximation; where does inflation actually come from? Even worse, we have not been able to detect any candidate dark matter particles in experiments (despite intensive and patient searches) and so we don't really know what dark matter is. And as for dark energy, we have even less idea. Is dark energy perhaps simply a phantom effect, with the real reason for cosmic acceleration being a consequence of whatever theory succeeds general relativity?
However our understanding of the Universe evolves in the future, we can be reasonably confident that an unalterable part of the picture will be the existence of the Big-Bang singularity. Thus, in our final chapter, we will turn to the Big Bang and describe how this event fits naturally into our current best theory of gravity: general relativity.

Chapter summary

  • Extra dimensions could exist if compactified on the Planck length scale. They would have an effect on the nature of gravity.
  • String theory describes the dynamics of one-dimensional strings, whose excitations are quantum particles. The theory describes the world sheet of the string and leads to a wave-equation of motion.
  • Superspace offers a different approach, where classical spacetime must be replaced with a quantum-mechanical structure.
  • Another alternative approach to quantum gravity involving the quantization of spacetime is offered by loop quantum gravity.
  • Anti-de Sitter spacetime has a boundary made up of Minkowski spacetime with one fewer spatial dimension.
  • $\Lambda$CDM is the 'standard model' of modern cosmology and is supported by a great deal of experimental evidence. It might be correct.

Exercises

(49.1) Show that the wavelength of a particle with an energy equal to $m_{\mathrm{P}} c^{2}$ would be around the Planck length $\ell_{\mathrm{P}}$. Estimate the gravitational self-energy of such a particle, as well as its Compton wavelength and Schwarzschild radius. (Ignore factors of 2 and $\pi$.)
(49.2) By considering the string action, show that our string theory is invariant with respect to reparametrization.
(49.3) Verify eqn 49.9.
(49.4) Show that
\begin{equation*} \vec{v}_{\perp}^{2}=\left(\frac{\partial \vec{X}}{\partial t}\right)^{2}-\left(\frac{\partial \vec{X}}{\partial s} \cdot \frac{\partial \vec{X}}{\partial t}\right)^{2} \tag{49.60} \end{equation*}
(49.5) (a) Starting from the Lagrangian expressed within static gauge, compute the canonical momentum of the string
\begin{equation*} \vec{P}(t, \sigma)=\frac{\partial L}{\partial\left(\partial_{t} \vec{X}\right)} \tag{49.61} \end{equation*}
(b) Use the definitions in Exercises 40.3 and 40.1, with the previous results, to compute the Hamiltonian
\begin{equation*} H=\int \mathrm{d} \sigma \mathcal{H} \tag{49.62} \end{equation*}
where $\mathcal{H}$ is the Hamiltonian density.
(49.6) Verify eqn 49.35.
(49.7) Consider a very narrow tube around a cosmic string. We will solve the Einstein equation for this situation, which we shall assume has an energy-momentum tensor with components $T_{\mu \nu}=\operatorname{diag}(\rho, 0,0,-\rho)$, which is designed to be proportional to the Minkowski metric in the $(t, z)$ plane. The region is described by the line element
\begin{equation*} \mathrm{d} s^{2}=-\mathrm{d} t^{2}+r_{0}^{2}\left(\mathrm{~d} \theta^{2}+\sin ^{2} \theta \mathrm{~d} \phi^{2}\right)+\mathrm{d} z^{2}, \tag{49.63} \end{equation*}
with constant $r_{0}$.
If we accept that this looks similar to the flat-space metric $\mathrm{d} s^{2}=-\mathrm{d} t^{2}+\mathrm{d} r^{2}+r^{2} \mathrm{~d} \phi^{2}+\mathrm{d} z^{2}$, then $r_{0} \theta$ in eqn 49.63 can be treated as a sort of radial variable.
(a) Compute the components of the Ricci tensor and the Ricci scalar for this spacetime and show that Einstein's equation is satisfied.
(b) If the outer surface of the string region occurs at $\theta_{\mathrm{m}}$, compute the cross-sectional area of the string and hence its mass per unit length.
(49.8) Following from the previous problem, now consider the region outside the string. The metric outside the string can be written as
\begin{equation*} \mathrm{d} s^{2}=-\mathrm{d} t^{2}+r_{0}^{2}\left(\frac{\cos ^{2} \theta}{\cos ^{2} \theta_{\mathrm{m}}} \mathrm{~d} \theta^{2}+\sin ^{2} \theta \mathrm{~d} \phi^{2}\right)+\mathrm{d} z^{2} \tag{49.64} \end{equation*}
This has the property that it smoothly joins on to the interior metric from the last question at $\theta_{\mathrm{m}}$.
(a) By making a suitable transformation, show that this metric describes flat spacetime in cylindrical polar coordinates.
(b) Although the spacetime seems flat, it isn't really. What is the circumference of a large circle in this spacetime with $r^{\prime}=a \gg r_{0}$?
(49.9) Consider a tetrahedron in three-dimensional space whose vertices are given by the four vectors $\overrightarrow{0}, \vec{a}, \vec{b}$, and $\vec{c}$. Derive expressions for the four area vectors $\vec{L}_{1}, \vec{L}_{2}, \vec{L}_{3}$, and $\vec{L}_{4}$, and show that they satisfy a closure property (eqn 49.42).
(49.10) Show that the volume of the tetrahedron in the previous exercise is given by
$$V=\frac{1}{6}|\vec{a} \times \vec{b} \cdot \vec{c}|$$
(49.11) Show that de Sitter space solves an Einstein equation with components $G_{\mu \nu}=-\left(3 / \alpha^{2}\right) g_{\mu \nu}$. How does this differ for anti-de Sitter space?
(49.12) Consider a defining equation for $\mathrm{AdS}^{3}$ given by … (49.65)
(49.13) … spacetime line element from eqn 49.56.

The Big-Bang singularity

I have shown that all the realms of the universe Are mortal, and the substance of the heavens Had birth; and I have explained most of those things That in heaven occur and must occur.
Lucretius (c. 100 BC-c. 50 BC) On the Nature of Things
The Robertson-Walker spaces, and many of the Friedmann universes that we have built from them, have one particularly notable feature: an initial Big-Bang singularity. In this chapter, we ask whether this is an artefact of our theory, or whether we should regard the Big Bang as a realistic event in our Universe. We shall show that, with a minimal set of assumptions, a singularity occurring at some time in the past is indeed a realistic, and possibly even an inevitable, prospect.$^{1}$
Let's construct a spacelike hypersurface $S$ in our spacetime at some fixed value of the cosmic time. We assume that it is a special sort of hypersurface known as a Cauchy surface. Such a surface has the property that the events on it completely determine some future surface, lying in what is known as the domain of dependence of $S$. If this is the case then it turns out there must exist a longest timelike curve from $S$ to some point on the domain of dependence.$^{2}$
The existence of such a longest timelike curve doesn't seem especially scandalous, but it is the subject of the argument in this chapter that reveals, in general terms, the necessity for a Big-Bang singularity to have started our Universe. Let's go to work.

50.1 Facts about Euclidean geometry

Let's first review a few useful facts about Euclidean 3-space. (We will then apply similar ideas to curved $(3+1)$-dimensional space.) We shall deal with a class of geodesics that all meet points on the surface $S$. Although these, being geodesics, all represent the shortest distance between some point $q$ and some specific point on the surface $S$, some will be shorter than others, depending on the point on $S$ on which they end.
Consider a path $\gamma$ between point $p$ and surface $S$. If the path is the shortest one possible between $p$ and $S$ then $\gamma$ must intersect $S$ orthogonally. If it doesn't, as in the curve $\gamma$ shown in Fig. 50.1, then there is a shorter path $\gamma^{\prime}$ that does meet the surface at right angles. We shall use the term 'orthogonal' to describe those geodesics that meet surfaces at right angles (they are also known as normal geodesics).
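The orthogonality claim is easy to test in a toy setting. The sketch below is a two-dimensional stand-in, assuming the 'surface' is the parabola $y=x^{2}/2$ and the point is $p=(2,3)$; both are arbitrary illustrative choices, as are the function names. It finds the closest point on the curve by a ternary search and checks that the shortest segment is perpendicular to the tangent there.

```python
import math

def curve(x):
    """Illustrative 'surface': the parabola y = x^2 / 2 in the plane."""
    return (x, 0.5 * x * x)

def tangent(x):
    """Tangent vector to the parabola at parameter x."""
    return (1.0, x)

def distance(p, x):
    """Euclidean distance from the point p to the curve point at parameter x."""
    sx, sy = curve(x)
    return math.hypot(p[0] - sx, p[1] - sy)

def closest_parameter(p, lo=-5.0, hi=5.0, iterations=200):
    """Ternary search for the minimiser of distance(p, x).

    Assumes the distance has a single minimum on [lo, hi], which holds
    for the point and curve chosen below.
    """
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if distance(p, m1) < distance(p, m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

p = (2.0, 3.0)  # illustrative point off the curve
x_min = closest_parameter(p)
foot = curve(x_min)

# Cosine of the angle between the shortest segment and the tangent at its foot.
sep = (p[0] - foot[0], p[1] - foot[1])
tan = tangent(x_min)
cos_angle = (sep[0] * tan[0] + sep[1] * tan[1]) / (math.hypot(*sep) * math.hypot(*tan))

print(f"closest point on curve: ({foot[0]:.4f}, {foot[1]:.4f})")
print(f"shortest distance     : {distance(p, x_min):.4f}")
print(f"cos(angle to tangent) : {cos_angle:.2e}")
print(f"an oblique path to x=2 has length {distance(p, 2.0):.4f}")
```

The cosine comes out at zero to numerical precision, so the shortest segment meets the curve at a right angle, while any straight path to a different point of the curve is longer.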
$^{1}$ In this chapter, we follow the argument in the form given by Geroch in General Relativity, 1972 Lecture Notes (2013). The techniques sketched here (sometimes called global techniques in their generalized form) are described in detail in Penrose (1973) and also in Wald (1984) and in Hawking and Ellis (1973).
$^{2}$ The necessity for this longest curve existing follows from a technicality: the fact that the collection of all timelike and null curves from a point on the domain of dependence to $S$ is compact. Compactness is discussed in Appendix C, but can be thought of here as implying that no curves go off to infinity. As a result, length is a continuous function on this space of curves, and must achieve a maximum.
Fig. 50.1 Curve $\gamma$ does not meet the surface $S$ at a right angle. The curve $\gamma^{\prime}$ that does is shorter.
Fig. 50.2 The shortest path between $q$ and the surface $S$ involves travelling along $\gamma$, rounding off the corner to avoid the crossing point at $r$ and then following $\gamma^{\prime}$ down to the surface $S$.
Fig. 50.3 The function $\phi$ measures the distance along the orthogonal geodesics from the surface $S$.
$^{3}$ This can be seen by writing $u_{\mu}=\partial \phi / \partial x^{\mu}$ and then writing $u_{\mu ; \nu}$ as
$\left(\frac{\partial \phi}{\partial x^{\mu}}\right)_{; \nu}=\frac{\partial^{2} \phi}{\partial x^{\nu} \partial x^{\mu}}-\Gamma_{\nu \mu}^{\alpha}\left(\frac{\partial \phi}{\partial x^{\alpha}}\right)$, which is manifestly symmetric since $\Gamma^{\alpha}{ }_{\mu \nu}=\Gamma^{\alpha}{ }_{\nu \mu}$.
Now consider two orthogonal geodesics $\gamma$ and $\gamma^{\prime}$ that cross at a point $r$ (Fig. 50.2). In this case, $\gamma$ can't be the shortest path from $q$ to the surface $S$. This is because we can construct a shorter curve by rounding off the corner at the crossing point $r$ and then following the other curve $\gamma^{\prime}$ down to $S$, as is shown in Fig. 50.2. This argument can be made a little more rigorous by recalling our discussion in Chapter 8 of whether the action for a particular trajectory is a minimum or not. We saw that the existence of a conjugate point along the trajectory guarantees that it does not represent the minimum. The crossing point of the two orthogonal geodesics in this example is just such a conjugate point.
We now turn to spacetime, where these facts will be used in a slightly modified form.

50.2 Orthogonal geodesics in spacetime

In curved spacetime, recall that the longest lines are the straightest. That is, owing to the sign of the metric the most proper time elapses along the lines with least acceleration. We therefore use the results argued for Euclidean space above, but with longest replacing shortest.
Consider again a surface $S$ and its orthogonal geodesics in spacetime. We want a way of measuring the distance from $S$ along all of the orthogonal geodesics. To do this, we define a function $\phi(x)$ that provides a measure of this distance for that geodesic that passes through the point with coordinates $x^{\mu}$, as shown in Fig. 50.3. We then define a velocity 1-form field $\tilde{\boldsymbol{u}}(x)$ for the orthogonal geodesics with components $u_{\mu}=\boldsymbol{\nabla}_{\mu} \phi$. We work in a spacetime with a metric, so we can raise indices and form a velocity vector field $\boldsymbol{u}(x)$ with components $u^{\mu}$. The velocity components have the usual property $u^{\mu} u_{\mu}=-1$. From the definition of $\tilde{\boldsymbol{u}}(x)$ in terms of the scalar function $\phi(x)$, the components $u_{\nu ; \mu}$ of the covariant derivative of $\tilde{\boldsymbol{u}}$ are symmetric$^{3}$ with respect to exchange of $\mu$ and $\nu$ and so
\begin{equation*} \left(\boldsymbol{\nabla}_{u} \tilde{\boldsymbol{u}}\right)_{\nu}=u^{\mu} u_{\nu ; \mu}=u^{\mu} u_{\mu ; \nu}=\frac{1}{2}\left(u^{\mu} u_{\mu}\right)_{; \nu}=0 \tag{50.1} \end{equation*}
since $u^{\mu} u_{\mu}=-1$. Raising the $\nu$ index on $u^{\mu} u_{\nu ; \mu}=0$ we obtain $\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{u}=0$. This implies that the velocity vector field $\boldsymbol{u}(x)$ is the tangent field of all of the orthogonal geodesics. We shall study the convergence $c$ of this field, defined as minus the divergence, or
\begin{equation*} c=-\boldsymbol{\nabla} \cdot \boldsymbol{u}=-u_{; \mu}^{\mu} . \tag{50.2} \end{equation*}
To make further progress, we need the result of the slightly tedious, but important, computation that follows in the next example.
Example 50.1
Consider the directional derivative of the scalar function $c$ along our geodesics
\begin{equation*} \boldsymbol{\nabla}_{\boldsymbol{u}} c=u^{\nu} \nabla_{\nu} c=-u^{\nu} u_{; \alpha \nu}^{\alpha} . \tag{50.3} \end{equation*}
This can be linked to curvature by using eqn 35.13 from Chapter 35, which says
\begin{equation*} u_{; \nu \mu}^{\alpha}-u_{; \mu \nu}^{\alpha}=R_{\beta \mu \nu}^{\alpha} u^{\beta}, \tag{50.4} \end{equation*}
which, on contracting $\mu$ with $\alpha$, reads
\begin{equation*} u_{; \nu \alpha}^{\alpha}-u_{; \alpha \nu}^{\alpha}=R_{\beta \alpha \nu}^{\alpha} u^{\beta}=R_{\beta \nu} u^{\beta}, \tag{50.5} \end{equation*}
where the components of the Ricci tensor have appeared in the final step. We then have
\begin{equation*} \boldsymbol{\nabla}_{\boldsymbol{u}} c=-u^{\nu} u_{; \nu \alpha}^{\alpha}+R_{\beta \nu} u^{\nu} u^{\beta} . \tag{50.6} \end{equation*}
This latter expression can be rewritten in a final form
\begin{equation*} \boldsymbol{\nabla}_{\boldsymbol{u}} c=-\left(u^{\nu} u^{\alpha}{ }_{; \nu}\right)_{; \alpha}+\left(u^{\nu}{ }_{; \alpha}\right)\left(u^{\alpha}{ }_{; \nu}\right)+R_{\beta \nu} u^{\nu} u^{\beta} . \tag{50.7} \end{equation*}
The first term on the right is zero, since $\boldsymbol{u}$ is tangent to the geodesics (so that $u^{\nu} u^{\alpha}{ }_{; \nu}=0$), and so we have
\begin{equation*} \boldsymbol{\nabla}_{\boldsymbol{u}} c=\left(u_{; \alpha}^{\nu}\right)\left(u_{; \nu}^{\alpha}\right)+R_{\beta \nu} u^{\nu} u^{\beta} . \tag{50.8} \end{equation*}
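The step from eqn 50.6 to eqn 50.7 uses only the product rule for the covariant derivative. As a short sketch in the same symbols as the example:

```latex
\begin{align*}
\left(u^{\nu} u^{\alpha}{}_{;\nu}\right)_{;\alpha}
  &= u^{\nu}{}_{;\alpha}\, u^{\alpha}{}_{;\nu} + u^{\nu}\, u^{\alpha}{}_{;\nu\alpha}, \\
\Rightarrow\quad
-u^{\nu}\, u^{\alpha}{}_{;\nu\alpha}
  &= -\left(u^{\nu} u^{\alpha}{}_{;\nu}\right)_{;\alpha}
     + u^{\nu}{}_{;\alpha}\, u^{\alpha}{}_{;\nu} .
\end{align*}
```

Substituting the second line for the first term of eqn 50.6 gives eqn 50.7, and the geodesic condition $u^{\nu} u^{\alpha}{}_{;\nu}=0$ then removes the total-derivative term.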
The result from the last example can be rewritten as
\begin{equation*} \boldsymbol{\nabla}_{\boldsymbol{u}} c=\left(u^{\nu ; \alpha}\right)\left(u_{\alpha ; \nu}\right)+R_{\beta \nu} u^{\nu} u^{\beta} . \tag{50.9} \end{equation*}
Let's pause and interpret this equation. It is telling us about the change in divergence of the world lines of the orthogonal geodesics as we move along them. This depends on two terms: the second is related to the curvature of spacetime via the Ricci tensor, which tells us about the way that curvature causes volumes to shrink. The first term is given by the quantity $u_{\alpha ; \nu}$, which is a symmetric matrix that we must effectively multiply by itself and then trace over the result. Although this seems a little abstract, the next example shows that this quantity obeys a neat inequality, which makes it very useful.
Example 50.2
We separate out the trace and the trace-free part of the quantity $u_{\alpha ; \nu}$ by using the transverse projection operator$^{4}$ $P_{\alpha \beta}=g_{\alpha \beta}+u_{\alpha} u_{\beta}$. We write$^{5}$
\begin{equation*} u_{\alpha ; \nu}=-\frac{c}{3} P_{\alpha \nu}+\left(u_{\alpha ; \nu}+\frac{c}{3} P_{\alpha \nu}\right) . \tag{50.10} \end{equation*}
This decomposition allows us to say
\begin{align*} \left(u^{\nu ; \alpha}\right)\left(u_{\alpha ; \nu}\right) & =\left[-\frac{c}{3} P^{\nu \alpha}+\left(u^{\nu ; \alpha}+\frac{c}{3} P^{\nu \alpha}\right)\right]\left[-\frac{c}{3} P_{\alpha \nu}+\left(u_{\alpha ; \nu}+\frac{c}{3} P_{\alpha \nu}\right)\right] \\ & =\frac{c^{2}}{3}+\left(u^{\nu ; \alpha}+\frac{c}{3} P^{\nu \alpha}\right)\left(u_{\alpha ; \nu}+\frac{c}{3} P_{\alpha \nu}\right) . \tag{50.11} \end{align*}
Consider the second term on the right, which effectively instructs us to take the square of the traceless part and then trace over the result. One thing we can say about the number that results is that, if it is non-zero, it must be positive. This leads to the key inequality
\begin{equation*} \left(u^{\nu ; \alpha}\right)\left(u_{\alpha ; \nu}\right) \geq \frac{c^{2}}{3} . \tag{50.12} \end{equation*}
This is the important technical result we need.
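The inequality can be checked numerically in a simplified setting. The sketch below assumes we work in a local orthonormal frame comoving with $\boldsymbol{u}$, where the projector $P_{\alpha\beta}$ reduces to the $3 \times 3$ identity and $u_{\alpha;\nu}$ to a symmetric $3 \times 3$ matrix `b`. The function names are hypothetical and chosen only for illustration.

```python
import random


def trace(m):
    """Trace of a 3x3 matrix stored as nested lists."""
    return sum(m[i][i] for i in range(3))


def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def split_trace(b):
    """Return (c, trace_free) with c = -trace(b), mirroring eqn 50.10.

    Assumption: in the comoving orthonormal frame P is the 3x3 identity.
    """
    c = -trace(b)
    trace_free = [[b[i][j] + (c / 3.0) * (1.0 if i == j else 0.0)
                   for j in range(3)] for i in range(3)]
    return c, trace_free


def inequality_terms(b):
    """Return (lhs, c**2 / 3, remainder) for the two sides of eqn 50.11."""
    c, trace_free = split_trace(b)
    lhs = trace(matmul(b, b))
    remainder = trace(matmul(trace_free, trace_free))
    return lhs, c * c / 3.0, remainder


def random_symmetric(rng):
    """Random symmetric 3x3 matrix with entries in [-1, 1]."""
    a = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)]
    return [[0.5 * (a[i][j] + a[j][i]) for j in range(3)] for i in range(3)]


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        lhs, bound, remainder = inequality_terms(random_symmetric(rng))
        print(lhs, bound, remainder)
```

In every case `lhs` should equal `bound + remainder` up to rounding, with `remainder` non-negative because it is a sum of squares of the entries of a symmetric matrix.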
Now consider the second term in eqn 50.9. Using a form of the Einstein equation we have, on multiplying by velocity components,
\begin{equation*} u^{\mu} u^{\nu} R_{\mu \nu}=8 \pi\left(T_{\mu \nu}-\frac{1}{2} g_{\mu \nu} T\right) u^{\mu} u^{\nu} . \tag{50.13} \end{equation*}
The term on the right is essentially an energy density, as seen by an observer with a tangent vector to their world line of $\boldsymbol{u}$. Remember from Chapter 13 that the weak energy condition says that this quantity must be positive. This means we can use eqn 50.12 to refine eqn 50.9 to read
\begin{equation*} \boldsymbol{\nabla}_{\boldsymbol{u}} c \geq \frac{c^{2}}{3} . \tag{50.14} \end{equation*}
Physically, this says that gravity acts attractively, making world lines tend to converge along their length. This latter equation can also be solved. If the world lines are parametrized by a proper time $\tau$ then $\boldsymbol{\nabla}_{\boldsymbol{u}} \equiv \mathrm{D} / \mathrm{d} \tau$ and we need only solve the simple differential equation
\begin{equation*} \frac{\mathrm{d} c}{\mathrm{d} \tau} \geq \frac{c^{2}}{3} . \tag{50.15} \end{equation*}
Taking the boundary conditions to be that, at the surface $S$, we have $\tau=0$ and also $c(\tau=0)=c_{0}$, then the result of integrating this equation is
\begin{equation*} c(\tau) \geq \frac{3}{3 / c_{0}-\tau} . \tag{50.16} \end{equation*}
The solution tells us about the effect of gravity drawing the world lines together: the convergence becomes infinite by a proper time $\tau=3 / c_{0}$. An infinite convergence implies that the world lines have started to cross. This is not particularly surprising, since world lines are certainly allowed to cross, but it is interesting. It means the world lines form a caustic, which is an envelope around the congruence of world lines that forces them to cross at some point. This caustic is simply a consequence of the attractive nature of gravity.
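The bounding case of eqn 50.15, where the inequality is an equality, can be integrated numerically and compared with the closed form in eqn 50.16. This is a minimal sketch using a fixed-step fourth-order Runge-Kutta scheme; the function names and the choice $c_0 = 1$ are assumptions made for illustration only.

```python
def exact_convergence(tau, c0):
    """Closed-form solution of dc/dtau = c**2 / 3 with c(0) = c0."""
    return 3.0 / (3.0 / c0 - tau)


def integrate_convergence(c0, tau_end, steps=10_000):
    """Integrate dc/dtau = c**2 / 3 from tau = 0 to tau_end with RK4."""
    def rhs(c):
        return c * c / 3.0

    h = tau_end / steps
    c = c0
    for _ in range(steps):
        k1 = rhs(c)
        k2 = rhs(c + 0.5 * h * k1)
        k3 = rhs(c + 0.5 * h * k2)
        k4 = rhs(c + h * k3)
        c += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return c


def blow_up_time(c0):
    """Proper time at which the bounding solution diverges (c0 > 0)."""
    return 3.0 / c0


if __name__ == "__main__":
    c0 = 1.0  # assumed initial convergence, arbitrary units
    for tau in (0.5, 1.5, 2.5):
        print(tau, integrate_convergence(c0, tau), exact_convergence(tau, c0))
    print("bound diverges at tau =", blow_up_time(c0))
```

The numerical and closed-form values should agree closely for $\tau$ well below $3/c_0$, and both grow without limit as $\tau$ approaches that value.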
So we have a situation where every orthogonal geodesic must cross another by the time it reaches an interval $\tau=3 / c_{0}$. Recall our argument about crossing: if one orthogonal geodesic crosses another, then the first cannot be the longest timelike curve from a point beyond $S$. But we also know (from our other geometric fact) that the longest curve must be an orthogonal geodesic. These contradictory statements can only be reconciled by the following:
If we go farther than $\tau=3 / c_{0}$ from $S$ along any timelike curve to some point $p$, then there is no longest timelike curve from $p$ to $S$.
This statement implies that sufficiently far from $S$, points in spacetime cannot be joined by an extremal timelike curve.
This all seems very abstract, as indeed it is, but the argument can be put straightforwardly: if we go a distance $3 / c_{0}$ from $S$, then we reach a point where there is no possible longest timelike curve from $S$. However, recall from the start of this chapter that if $S$ is a Cauchy surface, there must be a longest timelike curve to a future point. This contradiction has only one resolution: it must simply not be possible to get more than $3 / c_{0}$ away from $S$ by following a timelike curve.

50.3 Our Universe

After the lengthy and abstract argument of the last section, we are ready to apply the result to our Universe, which is likely very close to one of the Robertson-Walker spaces filled with perfect fluid, and hence is well described by a Friedmann model.
Consider a small comoving volume of space of 3-volume $\mathcal{V}$ in a Friedmann model. Dimensionally speaking, the convergence $c$ must scale as $-\dot{\mathcal{V}} / \mathcal{V}$. Since the volume $\mathcal{V}$ itself varies as $a(t)^{3}$, where $a(t)$ is the expansion factor, we conclude that the convergence of the Universe can be taken to be
\begin{equation*} c=-\frac{3 \dot{a}(t)}{a(t)} . \tag{50.17} \end{equation*}
Since $\dot{a}(t)$ and $a(t)$ are positive we see that, as a consequence of the expansion of the Universe, the world lines of dust particles in our Universe are diverging, rather than converging, as we assumed in the argument above. Rather disappointingly, therefore, we must conclude that the argument is inapplicable. Before losing hope, we note that if we play time backwards then the sign of $\dot{a}$ flips, while $a$ does not, and so we can then apply the argument to the past of our Universe, even if it doesn't seem to work for the future.
So now we take $S$ to be the current spacelike hypersurface of the Universe. From the argument above, it follows that it is not possible to trace a timelike curve backwards in time further than an interval in proper time of $\tau=3 / c_{0}$. Why has this happened? We have defined our theory of cosmology to take place on the smooth manifold that describes the theory of fields. As a result, the only point that could be reached by following a timelike curve back in time for an interval of $3 / c_{0}$ is one that does not live in the manifold at all. This point is the singularity for which we have been searching.
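As a rough numerical illustration of this bound (a sketch, not part of the original argument): in the time-reversed picture the present-day convergence is $c_{0}=3 \dot{a} / a=3 H_{0}$, writing $H_{0}$ for the present value of $\dot{a} / a$, so that $3 / c_{0}=1 / H_{0}$. The value $H_{0}=70$ km/s/Mpc used below is an assumed round number, and the function names are hypothetical. For comparison the code also evaluates the age $2 /(3 H_{0})$ of a matter-dominated model with $a \propto t^{2 / 3}$, a standard result that is not derived in this section.

```python
KM_PER_MPC = 3.0857e19        # kilometres in one megaparsec
SECONDS_PER_GYR = 3.15576e16  # seconds in 10**9 Julian years


def hubble_rate_per_second(h0_km_s_mpc):
    """Convert H0 from km/s/Mpc to 1/s."""
    return h0_km_s_mpc / KM_PER_MPC


def max_past_proper_time_gyr(h0_km_s_mpc):
    """Upper bound 3/c0 on proper time into the past, with c0 = 3*H0."""
    c0 = 3.0 * hubble_rate_per_second(h0_km_s_mpc)
    return (3.0 / c0) / SECONDS_PER_GYR


def dust_model_age_gyr(h0_km_s_mpc):
    """Age 2/(3*H0) of a matter-dominated model with a(t) ~ t**(2/3)."""
    h0 = hubble_rate_per_second(h0_km_s_mpc)
    return (2.0 / (3.0 * h0)) / SECONDS_PER_GYR


if __name__ == "__main__":
    h0 = 70.0  # assumed value in km/s/Mpc, not taken from the text
    print("bound 3/c0:", max_past_proper_time_gyr(h0), "Gyr")
    print("dust age  :", dust_model_age_gyr(h0), "Gyr")
```

With these assumed numbers the bound comes out at roughly 14 Gyr, and the age of the dust model lies below it, as the argument requires.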
The key here is that a singularity cannot conform to the demand that the manifold is smooth everywhere. It is a point that we must cut out of the spacetime if we are to describe the spacetime manifold using a field theory like relativity (Fig. 50.4). If we follow a timelike curve backwards and we reach a gap in the manifold then the curve cannot continue. This then is our initial Big-Bang singularity. In a Universe whose character is well approximated by the state of affairs that we have described in this chapter, a singular point in the past is therefore an inevitability. We conclude that the Universe starts (and this book ends) not with a whimper, but with a Big Bang.
Fig. 50.4 We are forced to cut out a singular point from a manifold, which must be a smooth space.

Chapter summary

  • A Cauchy surface $S$ has a longest timelike curve from $S$ to a point $p$ on the domain of dependence.
  • If a point $p$ is further from $S$ than $\tau=3 / c_{0}$ along any timelike curve then there is no longest timelike curve from $S$ to $p$.
  • A singularity in the spacetime manifold is the reason we cannot follow a timelike curve further back than $\tau=3 / c_{0}$ in a Robertson-Walker Universe.
  • An initial, Big-Bang singularity is therefore expected for a Robertson-Walker Universe on general grounds.

Exercises

(50.1) The $(0,2)$ projection tensor is given by
\begin{equation*} \boldsymbol{P}(\,,\,)=\boldsymbol{g}(\,,\,)+\tilde{\boldsymbol{u}}(\,) \otimes \tilde{\boldsymbol{u}}(\,) , \tag{50.18} \end{equation*}
where $\boldsymbol{u}$ is a velocity vector.
(a) Write the tensor in component form.
(b) Show that if we insert a vector $\boldsymbol{v}$ into $\boldsymbol{P}$ we project $\boldsymbol{v}$ into the 3-surface that is orthogonal to $\boldsymbol{u}$.
(c) Evaluate $P^{\mu \nu} P_{\mu \nu}$.
(d) Evaluate $P^{\alpha \beta} u_{\alpha ; \beta}$ for the case that $\boldsymbol{u}$ is tangent to a geodesic.
(e) If $\boldsymbol{n}$ is a unit spacelike vector, show that $\boldsymbol{P}=\boldsymbol{g}(\,,\,)-\tilde{\boldsymbol{n}}(\,) \otimes \tilde{\boldsymbol{n}}(\,)$ is the corresponding projection operator.
(50.2) The conventional derivation of the Raychaudhuri equation uses a very similar argument to the one in Section 50.2. We follow the approach of Zee here, which can be referred to for further details. Consider a congruence of timelike geodesics, parametrized by proper time $\tau$, with coordinates $x^{\mu}\left(\tau, \sigma^{1}, \sigma^{2}, \sigma^{3}\right)$ and tangent vectors $\boldsymbol{u}=\left(\partial x^{\mu} / \partial \tau\right) \boldsymbol{e}_{\mu}$. The vectors $\boldsymbol{W}_{i}=\left(\partial x^{\mu} / \partial \sigma^{i}\right) \boldsymbol{e}_{\mu}$ span a three-dimensional subspace which can be projected into using the operator $\boldsymbol{P}$ from the previous question.
(a) Show that the rate of change of $\boldsymbol{W}_{i}$ along the congruence is given by
\begin{equation*} \left(\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{W}_{i}\right)^{\mu}=B_{\nu}^{\mu} W_{i}^{\nu} , \tag{50.19} \end{equation*}
where $B_{\mu \nu}=u_{\mu ; \nu}$. The tensor $\boldsymbol{B}$ therefore measures the failure of the vector $\boldsymbol{W}_{i}$ to be parallel transported along the congruence.
(b) Show that $\boldsymbol{B}$ has the properties that $u^{\mu} B_{\mu \nu}=0$ and $B_{\mu \nu} u^{\nu}=0$, telling us that $\boldsymbol{B}$ also lives in the same three-space as $\boldsymbol{W}_{i}$.
The tensor $\boldsymbol{B}$ is conventionally split into: (i) a trace
\begin{equation*} \theta=P^{\mu \nu} B_{\mu \nu}=u_{; \mu}^{\mu}, \tag{50.20} \end{equation*}
which describes expansion; (ii) a symmetric, traceless part
\begin{equation*} \sigma_{\mu \nu}=\frac{1}{2}\left(B_{\mu \nu}+B_{\nu \mu}\right)-\frac{1}{3} \theta P_{\mu \nu} , \tag{50.21} \end{equation*}
which describes shear; and (iii) an antisymmetric part
\begin{equation*} \omega_{\mu \nu}=\frac{1}{2}\left(B_{\mu \nu}-B_{\nu \mu}\right) , \tag{50.22} \end{equation*}
which describes rotation. We therefore have
\begin{equation*} B_{\mu \nu}=\sigma_{\mu \nu}+\frac{1}{3} \theta P_{\mu \nu}+\omega_{\mu \nu} . \tag{50.23} \end{equation*}
(c) Show that
\begin{equation*} \left(\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{B}\right)_{\mu \nu}=-B_{\mu \alpha} B_{\nu}^{\alpha}-R_{\beta \mu \alpha \nu} u^{\beta} u^{\alpha}, \tag{50.24} \end{equation*}
which is known as the Raychaudhuri equation and is very useful in proving singularity theorems. Often the expansion parameter $\theta$ is useful to tell us if a congruence is expanding or contracting. Since $\mathrm{D} \theta / \mathrm{d} \tau=g^{\mu \nu}\left(\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{B}\right)_{\mu \nu}$, we can contract the result from (c) using $g^{\mu \nu}$.
(d) Show that
\begin{equation*} g^{\mu \nu} B_{\mu \alpha} B_{\nu}^{\alpha}=\sigma_{\mu \nu} \sigma^{\mu \nu}+\frac{1}{3} \theta^{2}-\omega_{\mu \nu} \omega^{\mu \nu} . \tag{50.25} \end{equation*}
(e) Use the previous result to obtain
\begin{equation*} \frac{\mathrm{D} \theta}{\mathrm{d} \tau}=-\frac{1}{3} \theta^{2}-\sigma_{\mu \nu} \sigma^{\mu \nu}+\omega_{\mu \nu} \omega^{\mu \nu}-R_{\mu \nu} u^{\mu} u^{\nu} . \tag{50.26} \end{equation*}
(50.3) In the case that $\omega_{\mu \nu}=0$, the tensor $\sigma_{\mu \nu}$ is purely spatial, implying
\begin{equation*} \frac{\mathrm{D} \theta}{\mathrm{d} \tau} \leq-\frac{1}{3} \theta^{2}-R_{\mu \nu} u^{\mu} u^{\nu} . \tag{50.27} \end{equation*}
Show that if we adopt the strong energy condition
\begin{equation*} T_{\mu \nu} v^{\mu} v^{\nu} \geq \frac{1}{2} T v^{\mu} v_{\mu} \tag{50.28} \end{equation*}
for all timelike vectors $\boldsymbol{v}$, then all of the geodesics approach each other, i.e. gravitation has a focussing effect on the congruence.

A
Further reading

Books must follow sciences, and not sciences books
Francis Bacon (1561-1626)
In my situation as Chancellor of the University of Oxford, I have been much exposed to authors
Arthur Wellesley, Duke of Wellington (1769-1852)
There are many excellent books on general relativity, cosmology, geometry and related fields.$^{1}$ Like many introductory books this one contains a compilation of many arguments, explanations and examples formulated and presented by other authors. Our sources are discussed at the end of this appendix.
In learning most subjects, one usually benefits from having read $\geq 2$ books and many of those mentioned in this chapter would provide a good supplement to this one. We like Geroch and Spivak's insightful explanations, Penrose (2004) and Hartle's approachability and Wald's precision. For learning general relativity, some recommended choices that share a similar approach to this book include: (i) Geroch (1978), Penrose (2004), Schutz (1985) and Hartle at an introductory level; (ii) d'Inverno, Guidry, Hobson/Efstathiou/Lasenby, and Zee at an intermediate level; and (iii) Geroch (2013, General Relativity), Hawking/Ellis, Landau/Lifshitz (1975), Misner/Thorne/Wheeler and Wald at an advanced level. For learning differential geometry we recommend: (i) Penrose (2004) and Schutz (1980) at an introductory level; (ii) Misner/Thorne/Wheeler at an intermediate level; and (iii) Geroch (1985 and 2013, Differential Geometry) and Spivak (2005) at a more advanced level (with the latter very suitable for readers with a background in mathematics). The problem books by Moore (at an elementary level) and by Lightman et al. and Blennow/Ohlsson (intermediate/advanced) are also warmly recommended. We have followed their approaches in some examples and exercises.
One thing to beware of in books on general relativity is the different sign conventions adopted. These change a number of the key equations. There are four conventions to watch out for (three of which are independent). We list the conventions of several books below.
  • The sign $s_{1}$ in front of the line element of the metric$^{2}$ $g$
\begin{equation*} s_{1} \mathrm{d} s^{2}=-\left(\mathrm{d} x^{0}\right)^{2}+\left(\mathrm{d} x^{1}\right)^{2}+\left(\mathrm{d} x^{2}\right)^{2}+\left(\mathrm{d} x^{3}\right)^{2} . \tag{A.1} \end{equation*}
  • The sign $s_{2}$ in front of the components of the Riemann tensor $\boldsymbol{R}$
\begin{align*} s_{2} R_{\nu \alpha \beta}^{\mu} & =\partial_{\alpha} \Gamma_{\beta \nu}^{\mu}-\partial_{\beta} \Gamma_{\alpha \nu}^{\mu}+\Gamma_{\alpha \sigma}^{\mu} \Gamma_{\beta \nu}^{\sigma}-\Gamma_{\beta \sigma}^{\mu} \Gamma_{\alpha \nu}^{\sigma}, \\ s_{2} \boldsymbol{R}(\boldsymbol{u}, \boldsymbol{v}) & =\nabla_{\boldsymbol{u}} \nabla_{\boldsymbol{v}}-\nabla_{\boldsymbol{v}} \nabla_{\boldsymbol{u}}-\nabla_{[\boldsymbol{u}, \boldsymbol{v}]} . \tag{A.2} \end{align*}
  • The sign $s_{3}$ in the Einstein equation
\begin{align*} \boldsymbol{G} & =s_{3} 8 \pi \boldsymbol{T} \\ G_{\mu \nu}=R_{\mu \nu}-\frac{1}{2} g_{\mu \nu} R & =s_{3} 8 \pi T_{\mu \nu} . \tag{A.3} \end{align*}
  • The sign $s_{4}=s_{3} / s_{2}$ in the contraction
\begin{equation*} s_{4} R_{\mu \nu}=R_{\mu \alpha \nu}^{\alpha} . \tag{A.4} \end{equation*}
Here are some of the sign conventions used in well-known books.$^{3}$
$^{3}$ References are given at the end of this appendix.

Further reading by chapter:

Most of the topics covered in this book are also discussed in the standard references on general relativity. The further reading list given below is based on books that use a similar approach to us, and those whose presentation we've followed in some of our arguments. Some are at an introductory and some at a more advanced level.
Chapter 1: an accessible introduction to special relativity can be found in French and in Geroch (1978); a useful summary is given in Landau and Lifshitz (vol. II). Links between geometry and special relativity are discussed in Ellis/Williams. Chapter 2: vectors are discussed in Boas and in Penrose (2004); their use in relativity is covered in Hartle and in Schutz (1985). Chapter 3: coordinate transformations are introduced in French and in Schutz (1985). Chapter 4: 1-forms are introduced in Schutz (1980 and 1985), Misner/Thorne/Wheeler and in Guidry. Ludvigsen gives a geometrical introduction to the energy-momentum tensor. Chapter 5: metrics are introduced in Hartle and Zee. Chapter 6: the principles of relativity are discussed in all books on relativity, and in most detail by Weinberg (1972). A historical account can be found in Pais. See Einstein for a collection of the original papers; these are put in context by Cheng. Chapter 7: the covariant derivative and connection coefficients are discussed by Schutz (1985) and in Misner/Thorne/Wheeler. Chapters 8 and 9: the method used to extract connection coefficients can be found in Zee and in Hartle. Chapter 10: the importance of the vielbein is stressed in Hartle, whose approach and notation we follow. They are used extensively in Lightman et al. Chapter 11: an introduction to Riemann curvature is found in Hartle and in Misner/Thorne/Wheeler. Chapter 12: an intuitive introduction to the energy-momentum tensor is given in Hartle. Misner/Thorne/Wheeler provides lots of insight and useful diagrams. Chapter 13: the construction of the Einstein equation is justified in Schutz (1985). See Feynman (1995) for a rather different approach. Einstein's route to the field equation is described in Pais. Chapter 15: cosmology is introduced in Lambourne. For a full account see Peacock and Weinberg (2006). Chapter 16: Robertson-Walker spaces are introduced in Lambourne and described in detail in Misner/Thorne/Wheeler.
More detail on hyperbolic spaces is covered in Penrose (2004) and in Needham (1997). Chapters 17 and 18: cosmological models are introduced in Penrose (2004) and, more systematically, in Lambourne. Chapter 19: conformal infinities and singularities are outlined in d'Inverno. Singularity theory is described at a more advanced level in Hawking/Ellis (whose presentation we follow in a highly simplified form) and in Penrose (1972). Chapter 20: Newtonian orbits are analysed in French and Ebbison. An advanced (but fascinating!) take is Gutzwiller. Chapter 21: the Schwarzschild geometry is introduced in Hartle, in Schutz (1985) and in Lambourne. Misner/Thorne/Wheeler gives a complete account. Chapters 22, 23 and 24: motion in the Schwarzschild geometry is discussed in Hartle (whose approach and notation we follow), in Moore and in Misner/Thorne/Wheeler. Chapter 25:
black holes are introduced in Blundell, and treated in all modern general relativity texts. See Hartle and Schutz (1985) for introductory treatments and Misner/Thorne/Wheeler for a more advanced discussion. Chandrasekhar gives a complete account (including a clear discussion of much of the material in this part of the book), albeit at a very advanced level. Chapters 26 and 27: black hole singularities are clearly explained in Wald, and we follow this approach. The analogy with accelerating Minkowski coordinates is discussed in Rindler. Wormholes are discussed in Misner/Thorne/Wheeler. Chapter 28: Hawking radiation is explained in Schutz (1985) and in Zee. For black hole thermodynamics, see Page (2005) and Carlip (2014). Chapter 29: charged and rotating black holes are introduced in Hartle. Our discussion of the Kerr metric follows Schutz (1985). Hawking/Ellis supplies additional insight. Chapter 30: classical curvature is introduced from a visual perspective in the wonderful book by Needham (2021), in Zee and (from a historical perspective) in Weinberg (1972). A full account is given in Lipschutz. Spivak (1999) gives translations of the key papers by Gauss and Riemann, along with Spivak's characteristically insightful commentary. Chapters 31, 32 and 37: modern geometry is introduced in Needham (2021), in Misner/Thorne/Wheeler and in Schutz (1980). We follow Misner/Thorne/Wheeler's presentation and notation in this part of the book. An introduction to the formal mathematics underlying this subject is given in Spivak (1971). See Spivak (2005) for the full story on all of the topics in this section. Chapter 33: an accessible introduction to the Lie derivative is found in Penrose (2004). The discussion in Schutz (1980) is also very accessible at an intermediate level. 
Chapters 34 and 35: the geometrical approach to the covariant derivative and the Riemann tensor is discussed in Penrose (2004) and Needham (2021) at an introductory level, and in Misner/Thorne/Wheeler at an advanced level. Spivak (2005) fills in the mathematical details. Hawking/Ellis provides lots of insight. Chapter 36: Cartan's method is explained in Needham (2021), and in more detail (with examples) in Misner/Thorne/Wheeler and in Nakahara. Some more applications can be found in Lightman et al. Chapter 38: chains are introduced very clearly in Ryder (1985). For the full story see Spivak (1971 and 2005). Chapter 39: a full account of fluid mechanics is given in Landau/Lifshitz (vol. VI). An introduction can be found in Feynman/Leighton/Sands (vol. II) and in Thorne/Blandford. Chapter 40: quantum field theory is described in Lancaster/Blundell. Some advanced topics are covered in Wald and in Padmanabhan. Chapter 41: inflation is discussed in Peacock. A more advanced discussion can be found in Padmanabhan. Fine tuning is considered in Lewis and Barnes. Chapter 42: the geometrical interpretation of electromagnetism is well described in Misner/Thorne/Wheeler. Its treatment as a field theory is discussed in Lancaster/Blundell. Chapter 43: the geometric view of the Bianchi identity is covered at an introductory level in Ryder (1985). See Misner/Thorne/Wheeler, whose approach we follow, for the full story. Chapter 44: gauge theory is discussed in Lancaster/Blundell and in Ryder (1985), whose approach we
follow. A nice introduction is given in Penrose (2004). Chapter 45: the weak-field limit is discussed at an introductory level in Schutz (1985) and in more detail in Misner/Thorne/Wheeler. Feynman (1995) has a characteristically interesting take, as does Geroch (2013, General Relativity). We follow Ryder's (2009) discussion of the Lense-Thirring effect in the problems. Chapter 46: gravitational waves are introduced clearly in Schutz (1985), whose approach we follow. A complete and modern treatment can be found in Thorne/Blandford. Chapter 47: the properties of gravitons in a quantum field theory are discussed in similar terms in Feynman (1995). Chapter 48: Kaluza-Klein theory is introduced in Zee, whose discussion we follow. Chapter 49: string theory is introduced in Zwiebach. Loop quantum gravity is described in Rovelli/Vidotto. A short history of the latter field is given in the review by Ashtekar. We follow Zee's discussion of particles in the AdS spacetime. Chapter 50: the argument we discuss is given in more detail in Geroch (2013, General Relativity). More detail on the methods can be found in Penrose (1973), Wald and also in Hawking/Ellis. Appendix C: good books on topological spaces include all of the lecture note volumes by Geroch (his course on Topology is the simplest), with more mathematical treatments available in Nakahara and in the book by Nash and Sen. See Penrose (2004) for a basic introduction to this material and Spivak for the full story. Geroch's book Mathematical Physics takes things further for the physicist. Appendix D: we follow Zee and Hartle's very clear discussions of embedding.

Bibliography

  • V. I. Arnold, Mathematical Methods of Classical Mechanics, 2nd edition, Springer, New York (1989).
  • A. Ashtekar, Quantum Gravity, arXiv:gr-qc/0410054v2 (2004).
  • M. Blennow and T. Ohlsson, 300 Problems in Special and General Relativity, CUP, Cambridge (2022).
  • K. M. Blundell, Black Holes, a Very Short Introduction, OUP, Oxford (2015).
  • M. L. Boas, Mathematical Methods in the Physical Sciences, 2nd edition, Wiley, New York (1983).
  • C. G. Böhmer, Introduction to General Relativity and Cosmology, World Scientific, London (2016).
  • H. R. Brown, Physical Relativity, OUP, Oxford (2006).
  • S. Carlip, Int. J. Mod. Phys. D 23, 1430023 (2014) [arXiv:1410.1486].
  • S. Carlip, General Relativity, a Concise Introduction, OUP, Oxford (2019).
  • S. Carroll, Spacetime and Geometry: An Introduction to General Relativity, CUP, Cambridge (2019).
  • S. Chandrasekhar, The Mathematical Theory of Black Holes, OUP, Oxford (1992).
  • T.-P. Cheng, Einstein's Physics, OUP, Oxford (2013).
  • Y. Choquet-Bruhat, C. DeWitt-Morette, and M. Dillard-Bleick, Analysis, Manifolds and Physics, North-Holland, Amsterdam (1977).
  • Y. Choquet-Bruhat, Introduction to General Relativity, Black Holes and Cosmology, OUP, Oxford (2015).
  • S. Coleman, Sidney Coleman's Lectures on Relativity, CUP, Cambridge (2022).
  • R. d'Inverno, Introduction to Einstein's Relativity, OUP, Oxford (1992).
  • A. Einstein, The Principle of Relativity, Dover, New York (1952).
  • G. F. R. Ellis and R. M. Williams, Flat and Curved Space-Times, (2nd edition), OUP, Oxford (2000).
  • R. P. Feynman, Feynman Lectures on Gravitation, Penguin, London (1995).
  • R. P. Feynman, R. B. Leighton, and M. Sands, The Feynman Lectures on Physics, Vol. II, Pearson Addison Wesley, San Francisco (2006).
  • J. Foster and D. J. Nightingale, A Short Course in General Relativity, 3rd edition, Springer, New York (2010).
  • T. Frankel, The Geometry of Physics, 2nd edition, CUP, Cambridge (2004).
  • A. P. French, Special Relativity, Chapman and Hall, London (1968).
  • A. P. French and M. G. Ebison, Introduction to Classical Mechanics, Chapman and Hall, London (1986).
  • R. Geroch, General Relativity from A to B, University of Chicago Press, Chicago (1978).
  • R. Geroch, Differential Geometry, 1972 Lecture Notes, Minkowski Institute Press, Montreal (2013).
  • R. Geroch, General Relativity, 1972 Lecture Notes, Minkowski Institute Press, Montreal (2013).
  • R. Geroch, Geometrical Quantum Mechanics, 1974 Lecture Notes, Minkowski Institute Press, Montreal (2013).
  • R. Geroch, Topology, 1978 Lecture Notes, Minkowski Institute Press, Montreal (2013).
  • R. Geroch, Mathematical Physics, Chicago University Press, Chicago (1985).
  • N. Gray, A Student's Guide to General Relativity, CUP, Cambridge (2019).
  • Ø. Grøn and S. Hervik, Einstein's General Theory of Relativity, Springer, New York (2007).
  • M. Guidry, Modern General Relativity, CUP, Cambridge (2019).
  • M. C. Gutzwiller, Chaos in Classical and Quantum Mechanics, Springer-Verlag, New York (1990).
  • J. B. Hartle, Gravity: an Introduction to Einstein's General Relativity, Pearson, Harlow (2014).
  • S. W. Hawking and G. F. R. Ellis, The Large Scale Structure of Space-time, CUP, Cambridge (1973).
  • M. P. Hobson, G. Efstathiou, and A. N. Lasenby, General Relativity, CUP, Cambridge (2006).
  • L. P. Hughston and K. P. Tod, An Introduction to General Relativity, CUP, Cambridge (1990).
  • R. J. A. Lambourne, Relativity, Gravitation and Cosmology, CUP, Cambridge (2010).
  • T. Lancaster and S. J. Blundell, Quantum Field Theory for the Gifted Amateur, OUP, Oxford (2014).
  • L. D. Landau and E. M. Lifshitz, Mechanics (volume I of Landau and Lifshitz), Pergamon, Oxford (1976).
  • L. D. Landau and E. M. Lifshitz, Classical Theory of Fields (volume II of Landau and Lifshitz), Pergamon, Oxford (1975).
  • L. D. Landau and E. M. Lifshitz, Fluid Mechanics (volume VI of Landau and Lifshitz), Pergamon, Oxford (1987).
  • G. F. Lewis and L. A. Barnes, A Fortunate Universe, Cambridge University Press, Cambridge (2016).
  • A. P. Lightman, W. H. Press, R. H. Price, and S. A. Teukolsky, Problem Book in Relativity and Gravitation, Princeton University Press, Princeton (1975).
  • S. Lipschutz, Schaum's Outline of Differential Geometry, McGraw-Hill, New York (1969).
  • M. Ludvigsen, General Relativity, CUP, Cambridge (1999).
  • M. Maggiore, Gravitational Waves, OUP, Oxford (2007).
  • C. W. Misner, K. S. Thorne, and J. A. Wheeler, Gravitation, W. H. Freeman and company, New York (1973).
  • T. A. Moore, A General Relativity Workbook, University Science Books, Mill Valley, CA (2013).
  • V. F. Mukhanov and S. Winitzki, Introduction to Quantum Effects in Gravity, CUP, Cambridge (2007).
  • C. Nash and S. Sen, Topology and Geometry for Physicists, Dover, New York (1983).
  • M. Nakahara, Geometry, Topology and Physics, Adam Hilger, Bristol (1990).
  • H. Năstase, String theory methods for condensed matter physics, CUP, Cambridge (2017).
  • T. Needham, Visual Complex Analysis, OUP, Oxford (1997).
  • T. Needham, Visual Differential Geometry and Forms, Princeton University Press, Princeton (2021).
  • H. C. Ohanian and R. Ruffini, Gravitation and Spacetime, 3rd edition, CUP, Cambridge (2013).
  • T. Padmanabhan, Gravitation: Foundations and Frontiers, CUP, Cambridge (2010).
  • D. N. Page, New. J. Phys. 7, 203 (2005).
  • A. Pais, Subtle Is the Lord: The Science and the Life of Albert Einstein, OUP, Oxford (2005).
  • J. A. Peacock, Cosmological Physics, CUP, Cambridge (1999).
  • R. Penrose, Techniques of Differential Topology in Relativity, SIAM, Philadelphia (1973).
  • R. Penrose, The Road to Reality, Vintage, London (2004).
  • J. Plebański and A. Krasiński, An Introduction to General Relativity and Cosmology, CUP, Cambridge (2006).
  • E. Poisson, A Relativist's Toolkit, CUP, Cambridge (2004).
  • E. Poisson and C. M. Will, Gravity, CUP, Cambridge (2014).
  • W. Rindler, Relativity, OUP, Oxford (2006).
  • C. Rovelli, General Relativity: The Essentials, CUP, Cambridge (2021).
  • C. Rovelli and F. Vidotto, Covariant Loop Quantum Gravity, CUP, Cambridge (2015).
  • L. H. Ryder, Quantum Field Theory, CUP, Cambridge (1985).
  • L. H. Ryder, Introduction to General Relativity, CUP, Cambridge (2009).
  • D. W. Sciama, The Physical Foundations of General Relativity, Doubleday & Co., New York (1969).
  • B. F. Schutz, A First Course in General Relativity, CUP, Cambridge (1985).
  • B. F. Schutz, Geometrical Methods of Mathematical Physics, CUP, Cambridge (1980).
  • M. Spivak, Calculus on Manifolds, Westview Press, Boulder (1971).
  • M. Spivak, A Comprehensive Introduction to Differential Geometry: Vol 1, 3rd Edition, Publish or Perish, Houston (2005).
  • M. Spivak, A Comprehensive Introduction to Differential Geometry: Vol 2, 3rd Edition, Publish or Perish, Houston (1999).
  • J. L. Synge, Relativity: The General Theory, North-Holland, New York (1960).
  • E. F. Taylor, J. A. Wheeler, and E. Bertschinger, Exploring Black Holes, 2nd Edition, available for free download from eftaylor.com/exploringblackholes (2017).
  • K. S. Thorne and R. D. Blandford, Modern Classical Physics, Princeton University Press, Princeton (2017).
  • R. M. Wald, General Relativity, University of Chicago Press, Chicago (1984).
  • S. Weinberg, Gravitation and Cosmology, Wiley, New York (1972).
  • S. Weinberg, Cosmology, CUP, Cambridge (2008).
  • A. Zee, Einstein Gravity in a Nutshell, Princeton University Press, Princeton (2013).
  • B. Zwiebach, A First Course in String Theory, 2nd edition, CUP, Cambridge (2009).

B

Conventions and notation

B.1 Electromagnetic units
B.2 Vectors, 1-forms and tensors
B.3 Covariant derivatives
$^{1}$ Almost all books on classical and quantum field theories use Heaviside-Lorentz units, though the famous textbooks on electrodynamics by Landau and Lifshitz and by Jackson do not.
$^{2}$ These units are named after the English electrical engineer O. Heaviside (1850-1925) and the Dutch physicist H. A. Lorentz (1853-1928).
$^{3}$ We use passive transformations in this subject. That is to say, our transformations change the coordinates describing the position of an event, rather than changing the position of the event itself: the latter being an active transformation. Sidney Coleman notes that, in criminal circles, a passive transformation is analogous to an alias (the criminal is an event, after the transformation they remain at the position of the crime in spacetime, but they look different owing to the transformation), while the active transformation is like an alibi (the criminal/event is transformed to a different position in spacetime to the position of the crime).
$^{4}$ Upstairs components are sometimes called contravariant components. We mostly avoid this terminology.

This appendix contains a summary of some of the choices of conventions and notation we have made in the book.

B.1 Electromagnetic units

In SI units, Maxwell's equations in free space can be written as
\begin{equation*}
\begin{array}{ll}
\vec{\nabla} \cdot \vec{E}=\frac{\rho}{\epsilon_{0}}, & \vec{\nabla} \times \vec{E}=-\frac{\partial \vec{B}}{\partial t}, \\
\vec{\nabla} \cdot \vec{B}=0, & \vec{\nabla} \times \vec{B}=\mu_{0} \vec{J}+\frac{1}{c^{2}} \frac{\partial \vec{E}}{\partial t} .
\end{array} \tag{B.1}
\end{equation*}
Although SI units are preferable for many applications in physics, the desire to make our (admittedly often complicated) equations as simple as possible motivates a different choice of units for the discussion of electromagnetism in field theory.$^{1}$ We therefore choose the Heaviside-Lorentz$^{2}$ system of units (also known as the 'rationalized Gaussian CGS' system), which can be obtained from SI by setting $\epsilon_{0}=\mu_{0}=1$. Thus, the electrostatic potential $V(\vec{x})=q / 4 \pi \epsilon_{0}|\vec{x}|$ of SI becomes $V(\vec{x})=q / 4 \pi|\vec{x}|$ in Heaviside-Lorentz units, and Maxwell's equations can be written as
\begin{equation*}
\begin{array}{ll}
\vec{\nabla} \cdot \vec{E}=\rho, & \vec{\nabla} \times \vec{E}=-\frac{1}{c} \frac{\partial \vec{B}}{\partial t}, \\
\vec{\nabla} \cdot \vec{B}=0, & \vec{\nabla} \times \vec{B}=\frac{1}{c}\left(\vec{J}+\frac{\partial \vec{E}}{\partial t}\right) .
\end{array} \tag{B.2}
\end{equation*}
Using our other choice of $c=1$ obviously removes the factors of $c$ too.

B.2 Vectors, 1-forms and tensors

In a particular basis, a vector is described by a set of components. If the basis is rotated, then the components will change, but the length of the vector will be unchanged.$^{3}$ Three-vectors (or 3-vectors) have three spatial components [such as $(A^{x}, A^{y}, A^{z})$ in a Cartesian coordinate system] and are denoted by a letter with an arrow on top, such as $\vec{A}$ or $\vec{p}$. The components of 3-vectors are listed with a Roman index taken from the middle of the alphabet: e.g. $A^{i}$, with $i=1,2,3$, so that we can write components $A^{i}=(A^{1}, A^{2}, A^{3})$. We sometimes use the names of coordinates for the components: e.g. $A^{i}=(A^{x}, A^{y}, A^{z})$. Component labels for vectors are always written in the upstairs position$^{4}$ (e.g. $A^{i}$) and never downstairs ($A_{i}$).
In most applications, we deal with $(3+1)$-dimensional spacetime. A four-vector (or 4-vector) that lives in this spacetime is a vector-valued object with a single timelike component and three spacelike components, which themselves form a three-vector. Four-vectors are displayed in bold script (e.g. $\boldsymbol{v}$). All bold-script quantities are coordinate free, existing independently of a specific basis. When referred to a basis, four-vector components are given a Greek index: for example, $v^{\mu}$ where $\mu=0,1,2,3$. We write $v^{\mu}=(v^{0}, v^{1}, v^{2}, v^{3})$ or $(v^{0}, v^{i})$ or $(v^{0}, \vec{v})$. The zeroth component, $v^{0}$, is the timelike part. Basis vectors are written as $\boldsymbol{e}_{\mu}$ (and sometimes $\partial / \partial x^{\mu}$), so we can write a vector in terms of its components as $\boldsymbol{v}=v^{\mu} \boldsymbol{e}_{\mu}$, with a bold part on both sides of the equality.$^{5}$
The vector's natural partner is the 1-form. These are written in frame-independent form using bold type with a tilde, e.g. $\tilde{\boldsymbol{A}}$. Like vectors they can be split into components and basis 1-forms, the latter written as $\boldsymbol{\omega}^{\mu}$ (and sometimes $\boldsymbol{d} x^{\mu}$). In terms of components and basis 1-forms, we write $\tilde{\boldsymbol{A}}=A_{\mu} \boldsymbol{\omega}^{\mu}$. Components of 1-forms always have the index written in the down position.$^{6}$ An example of a familiar 1-form is the gradient of a function $f(x^{\mu})$, whose components are the derivatives $\frac{\partial f}{\partial x^{\mu}}$, which is sometimes written as $\partial_{\mu} f$ and sometimes written using the comma notation such that $\frac{\partial f}{\partial x^{\mu}}=f_{, \mu}$.
We use the Einstein convention that all indices repeated in both an up and down position are summed over. Inner products between 1-forms and vectors are written with angle brackets: $\langle\tilde{\boldsymbol{A}}, \boldsymbol{v}\rangle=A_{\mu} v^{\mu}$. Dot products (or, equivalently, scalar products) between two vectors are written as $\boldsymbol{v} \cdot \boldsymbol{u}=g_{\mu \nu} v^{\mu} u^{\nu}$, where $g_{\mu \nu}$ are the components of the metric.
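To make the summation convention concrete, here is a minimal Python sketch of the pairing $A_{\mu} v^{\mu}$ and the dot product $g_{\mu \nu} v^{\mu} u^{\nu}$. It assumes flat spacetime with the Minkowski components $\operatorname{diag}(-1,1,1,1)$; the function names (`pairing`, `dot`, `lower`) and the sample components are illustrative choices of ours, not notation from the text.

```python
# Minkowski metric components eta_{mu nu} = diag(-1, 1, 1, 1), signature (-+++).
# Assumption: flat spacetime; any other metric could be passed in as `g`.
ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]


def pairing(A_down, v_up):
    """<A, v> = A_mu v^mu: one repeated up-down index, so one sum."""
    return sum(A_down[mu] * v_up[mu] for mu in range(4))


def dot(v_up, u_up, g=ETA):
    """v . u = g_{mu nu} v^mu u^nu: two repeated index pairs, so two sums."""
    return sum(g[mu][nu] * v_up[mu] * u_up[nu]
               for mu in range(4) for nu in range(4))


def lower(v_up, g=ETA):
    """v_mu = g_{mu nu} v^nu: lower an index with the metric."""
    return [sum(g[mu][nu] * v_up[nu] for nu in range(4)) for mu in range(4)]


v = [2, 1, 0, 0]  # illustrative components v^mu
u = [3, 0, 1, 0]  # illustrative components u^mu

# Lowering the index on v and pairing the result with u gives the dot product.
assert pairing(lower(v), u) == dot(v, u)
print(dot(v, u))  # -6
```

The check at the end illustrates why the two kinds of product agree: lowering an index with the metric turns a vector into its partner 1-form.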
Tensors are treated as slot machines and given bold symbols like $\boldsymbol{T}(\ ,\ )$. Their valence is specified separately in the form $(n, m)$, meaning $n$ slots for 1-forms and $m$ slots for vectors.$^{7}$ When the slots are filled, the tensor outputs a number. Components can be extracted using the basis vectors $\boldsymbol{e}_{\mu}$ and basis 1-forms $\boldsymbol{\omega}^{\nu}$ via equations such as the following for a $(2,2)$ tensor: $\boldsymbol{S}(\boldsymbol{\omega}^{\mu}, \boldsymbol{\omega}^{\nu}, \boldsymbol{e}_{\alpha}, \boldsymbol{e}_{\beta})=S^{\mu \nu}{ }_{\alpha \beta}$. Tensors can be combined using outer products denoted $\otimes$, or wedge products denoted $\wedge$, with the relationship $\boldsymbol{v} \wedge \boldsymbol{u}=\boldsymbol{v} \otimes \boldsymbol{u}-\boldsymbol{u} \otimes \boldsymbol{v}$. Tensors can also be written in terms of their components using this notation
\begin{equation*}
\boldsymbol{S}=S^{\mu \nu}{ }_{\alpha \beta}\left(\boldsymbol{e}_{\mu} \otimes \boldsymbol{e}_{\nu} \otimes \boldsymbol{\omega}^{\alpha} \otimes \boldsymbol{\omega}^{\beta}\right) \tag{B.3}
\end{equation*}
Symmetrization of components is denoted with a round bracket, such as
\begin{equation*}
T^{(\alpha \beta)}=\frac{1}{2}\left(T^{\alpha \beta}+T^{\beta \alpha}\right) \tag{B.4}
\end{equation*}
Antisymmetrization of components is denoted with a square bracket, such as
\begin{equation*}
T^{[\alpha \beta]}=\frac{1}{2}\left(T^{\alpha \beta}-T^{\beta \alpha}\right) \tag{B.5}
\end{equation*}
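The bracket conventions of eqns B.4 and B.5, together with the wedge-product relationship quoted above, can be checked on a small example. The following Python sketch works with two-index components stored as nested lists; the helper names and the sample numbers are our own illustrative assumptions.

```python
def symmetrize(T):
    """T^(ab) = (T^ab + T^ba) / 2, as in eqn B.4."""
    n = len(T)
    return [[0.5 * (T[a][b] + T[b][a]) for b in range(n)] for a in range(n)]


def antisymmetrize(T):
    """T^[ab] = (T^ab - T^ba) / 2, as in eqn B.5."""
    n = len(T)
    return [[0.5 * (T[a][b] - T[b][a]) for b in range(n)] for a in range(n)]


def outer(v, u):
    """Components of the outer product: (v (x) u)^ab = v^a u^b."""
    return [[va * ub for ub in u] for va in v]


def wedge(v, u):
    """Components of v ^ u = v (x) u - u (x) v."""
    vu, uv = outer(v, u), outer(u, v)
    n = len(v)
    return [[vu[a][b] - uv[a][b] for b in range(n)] for a in range(n)]


T = [[1.0, 2.0],
     [4.0, 3.0]]  # illustrative components T^ab
S, A = symmetrize(T), antisymmetrize(T)

# Any two-index object is the sum of its symmetric and antisymmetric parts.
assert all(S[a][b] + A[a][b] == T[a][b] for a in range(2) for b in range(2))

v, u = [1.0, 2.0], [3.0, 5.0]
# With these conventions the wedge product is twice the antisymmetrized outer product.
twice_antisym = [[2 * x for x in row] for row in antisymmetrize(outer(v, u))]
assert wedge(v, u) == twice_antisym
```

Note the factor of two in the final check: the wedge product as defined here carries no factor of $\frac{1}{2}$, whereas the square bracket does.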
The trace of a tensor is denoted by an italic letter,$^{8}$ e.g. $T=T^{\mu}{ }_{\mu}$. Tensor components are sometimes denoted with the coordinates (e.g. $\mu=t, r, \theta, \phi$) and sometimes, equivalently, numbers (e.g. $\mu=0 \ldots 3$). Using the latter, ordered indices $|\mu \nu|$ are arranged such that $\mu<\nu$.
$^{5}$ Also in $(3+1)$ dimensions we generally use $\mathcal{V}$ to denote a 4-volume and $V$ for a 3-volume. The invariant 4-volume is usually $\mathrm{d} \mathcal{V}$ and the invariant 3-volume is $\mathrm{d} \Sigma$. Some other texts use $\mathrm{d} \Omega$ for the invariant 4-volume, but we reserve $\mathrm{d} \Omega^{2}=\mathrm{d} \theta^{2}+\sin ^{2} \theta \, \mathrm{d} \phi^{2}$ for the angular part of the spherical line element.
$^{6}$ These are sometimes called covariant components.
$^{7}$ On the few occasions we want to make an argument about a general matrix, rather than about a tensor, we denote the matrix $\underline{\boldsymbol{X}}$.
$^{8}$ The most important tensor in this subject is the metric. This is a $(0,2)$ tensor $\boldsymbol{g}(\ ,\ )$ with components $g_{\mu \nu}=\boldsymbol{g}(\boldsymbol{e}_{\mu}, \boldsymbol{e}_{\nu})=\boldsymbol{e}_{\mu} \cdot \boldsymbol{e}_{\nu}$. In an exception to our rules, the determinant of the metric (not the trace) is denoted $g$. We use the signature $(-+++)$ and specify the components of diagonal matrices by saying, for example, that the components of the Minkowski tensor are $\eta_{\mu \nu}=\operatorname{diag}(-1,1,1,1)$. Indices are raised and lowered with the components of the metric.
$^{9}$ If, as in Chapter 30, we do write points on the world line in terms of a set of displacement vectors $\boldsymbol{X}=X^{\mu} \boldsymbol{e}_{\mu}$ then we could write the tangent as
\begin{align*}
\boldsymbol{u} & =\frac{\mathrm{d} \boldsymbol{X}(\tau)}{\mathrm{d} \tau} \\
& =\frac{\mathrm{d} x^{\mu}}{\mathrm{d} \tau} \cdot \frac{\partial \boldsymbol{X}(\tau)}{\partial x^{\mu}} \tag{B.6}
\end{align*}
so that the components are $u^{\mu}=\frac{\mathrm{d} x^{\mu}}{\mathrm{d} \tau}$ and the basis vectors on the curve are $\boldsymbol{e}_{\mu}=\frac{\partial \boldsymbol{X}(\tau)}{\partial x^{\mu}}$. However, the displacement vector does not transform according to the tensor transformation law, so it is not useful for general relativity. The modern way of looking at vectors (Chapter 31) is not to invoke the displacement vectors and instead specify the tangent field as
\begin{equation*}
\boldsymbol{u}(x)=\frac{\mathrm{d} x^{\mu}}{\mathrm{d} \tau} \frac{\partial}{\partial x^{\mu}} \tag{B.7}
\end{equation*}
so that the basis vectors are written as $\boldsymbol{e}_{\mu}=\frac{\partial}{\partial x^{\mu}}$.
Tensor fields are functions of position. We write a vector field $\boldsymbol{v}(x)$, meaning that at a point $x$ we output a vector $\boldsymbol{v}$. The point here could be an abstract point in a manifold $\mathcal{P}$ or the coordinates of this point $x^{\mu}(\mathcal{P})$. This is intended to prevent any confusion with the slots carried by the tensor (e.g. the single slot of a vector field that takes a 1-form).
Position vectors, interpreted as pointing between points in spacetime, are not very useful in curved spacetime. Instead, we usually specify a general point in spacetime $\mathcal{P}$ or its coordinate $x^{\mu}(\mathcal{P})$, which are not treated as the components of a vector.$^{9}$ The most important vector field in relativity is the velocity, which provides the tangents to a world line $x^{\mu}(\tau)$, which is a curve parametrized by an affine parameter such as the proper time $\tau$. The velocity field is given by $\boldsymbol{u}(x)=\left(\frac{\mathrm{d} x^{\mu}(\tau)}{\mathrm{d} \tau}\right) \boldsymbol{e}_{\mu}$, with the property $\boldsymbol{u} \cdot \boldsymbol{u}=-1$.
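The normalization $\boldsymbol{u} \cdot \boldsymbol{u}=-1$ is easy to verify numerically in the simplest case. The sketch below assumes flat spacetime, units with $c=1$, and a particle moving with constant 3-velocity, for which the standard special-relativistic components are $u^{\mu}=\gamma(1, \vec{v})$; the function names are illustrative.

```python
import math

# Diagonal Minkowski components, signature (-+++); flat spacetime is assumed.
ETA_DIAG = (-1.0, 1.0, 1.0, 1.0)


def four_velocity(vx, vy, vz):
    """Components u^mu = gamma * (1, vx, vy, vz) for 3-velocity v, with c = 1."""
    speed2 = vx * vx + vy * vy + vz * vz
    if speed2 >= 1.0:
        raise ValueError("a massive particle must have speed below c = 1")
    gamma = 1.0 / math.sqrt(1.0 - speed2)
    return [gamma, gamma * vx, gamma * vy, gamma * vz]


def norm2(u):
    """u . u = eta_{mu nu} u^mu u^nu for a diagonal metric."""
    return sum(ETA_DIAG[mu] * u[mu] * u[mu] for mu in range(4))


u = four_velocity(0.6, 0.0, 0.0)  # gamma = 1.25 for this speed
assert math.isclose(norm2(u), -1.0)
```

The sign of the result is fixed by the $(-+++)$ signature; with the opposite signature the same velocity would satisfy $\boldsymbol{u} \cdot \boldsymbol{u}=+1$.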
In the orthonormal frame, we write components with a hat. So a vector is written as $\boldsymbol{A}=A^{\hat{\alpha}} \boldsymbol{e}_{\hat{\alpha}}$. Indices in an orthonormal frame are raised and lowered with the Minkowski metric with components $\eta_{\mu \nu}=\operatorname{diag}(-1,1,1,1)$. To translate between the orthonormal frame and a coordinate frame, we use the components of a vielbein, written using brackets in expressions such as
\begin{equation*}
\begin{array}{ll}
A^{\hat{\alpha}}=\left(\boldsymbol{e}_{\mu}\right)^{\hat{\alpha}} A^{\mu}, & A^{\mu}=\left(\boldsymbol{e}_{\hat{\alpha}}\right)^{\mu} A^{\hat{\alpha}}, \\
Z_{\mu}=\left(\boldsymbol{e}_{\mu}\right)^{\hat{\alpha}} Z_{\hat{\alpha}}, & Z_{\hat{\alpha}}=\left(\boldsymbol{e}_{\hat{\alpha}}\right)^{\mu} Z_{\mu} .
\end{array} \tag{B.8}
\end{equation*}
For a diagonal metric, with non-zero components $g_{\mu \mu}$, we have the useful square-root rule $\left(\boldsymbol{e}_{\mu}\right)^{\hat{\mu}}=\sqrt{\left|g_{\mu \mu}\right|}$, where no summation is implied.
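As an illustration of the square-root rule, the following Python sketch converts coordinate components into orthonormal-frame components and checks that the squared length of the vector is the same in both descriptions. The metric is an assumption made for the example: flat spacetime in spherical coordinates $(t, r, \theta, \phi)$, with $g_{\mu \nu}=\operatorname{diag}(-1,1, r^{2}, r^{2} \sin ^{2} \theta)$. The inverse map simply divides by the same square roots, as the first line of eqn B.8 requires for a round trip.

```python
import math

ETA_DIAG = [-1.0, 1.0, 1.0, 1.0]  # orthonormal-frame metric, signature (-+++)


def metric_diag(r, theta):
    """Assumed example metric: flat spacetime in coordinates (t, r, theta, phi)."""
    return [-1.0, 1.0, r ** 2, (r * math.sin(theta)) ** 2]


def to_orthonormal(A_coord, g_diag):
    """A^hat(mu) = sqrt(|g_mumu|) A^mu (no sum): the square-root rule."""
    return [math.sqrt(abs(g)) * a for g, a in zip(g_diag, A_coord)]


def to_coordinate(A_hat, g_diag):
    """Inverse map: A^mu = A^hat(mu) / sqrt(|g_mumu|) (no sum)."""
    return [a / math.sqrt(abs(g)) for g, a in zip(g_diag, A_hat)]


def norm2(components, diag):
    """Squared length with a diagonal metric: sum over mu of g_mumu (A^mu)^2."""
    return sum(d * c * c for d, c in zip(diag, components))


g = metric_diag(r=2.0, theta=math.pi / 3)
A = [1.0, 0.5, 0.1, 0.2]  # illustrative coordinate components A^mu
A_hat = to_orthonormal(A, g)

# The length computed with g in the coordinate frame matches the length
# computed with eta in the orthonormal frame.
assert math.isclose(norm2(A, g), norm2(A_hat, ETA_DIAG))
```

The agreement relies on the signs of the diagonal metric components matching those of $\eta_{\mu \nu}$, which holds for the signature used in this book.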

B.3 Covariant derivatives

The most useful derivative in relativity is the covariant derivative, which is written in frame-independent form as $\boldsymbol{\nabla}_{\boldsymbol{u}}$, which is equivalent to $\boldsymbol{\nabla}_{\boldsymbol{u}}=u^{\mu} \boldsymbol{\nabla}_{\boldsymbol{e}_{\mu}}$, where $\boldsymbol{u}$ is a vector. This is a directional derivative, taken along the direction of the vector $\boldsymbol{u}$. When the direction is given in terms of a basis vector we write $\boldsymbol{\nabla}_{\boldsymbol{e}_{\mu}}=\boldsymbol{\nabla}_{\mu}$. Confusingly, this is not a component expression: the $\mu$ subscript is short for $\boldsymbol{e}_{\mu}$, where $\mu$ labels the direction along which the derivative is taken. In terms of components and a basis, the covariant derivative can be written as
\begin{equation*}
\boldsymbol{\nabla}_{\mu} \boldsymbol{v}=\left(\boldsymbol{\nabla}_{\mu} \boldsymbol{v}\right)^{\alpha} \boldsymbol{e}_{\alpha}=v^{\alpha}{ }_{; \mu} \boldsymbol{e}_{\alpha} \tag{B.9}
\end{equation*}
where the final expression uses semicolon notation. The latter is a component notation defined as
\begin{equation*}
v^{\alpha}{ }_{; \mu}=\frac{\partial v^{\alpha}}{\partial x^{\mu}}+\Gamma^{\alpha}{ }_{\mu \nu} v^{\nu} \tag{B.10}
\end{equation*}
We also use a further notation for the covariant derivative made along a curve $x^{\mu}(\tau)$ with tangent $\boldsymbol{u}(x)$, which we write as $\mathrm{D} \boldsymbol{v} / \mathrm{d} \tau=\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{v}$ and
\begin{equation*}
\left(\frac{\mathrm{D} \boldsymbol{v}}{\mathrm{d} \tau}\right)^{\alpha}=u^{\mu}\left(\frac{\partial v^{\alpha}}{\partial x^{\mu}}+\Gamma^{\alpha}{ }_{\mu \nu} v^{\nu}\right)=u^{\mu} v^{\alpha}{ }_{; \mu} \tag{B.11}
\end{equation*}
where the velocity $\boldsymbol{u}$ is tangent to the curve parametrized by $\tau$.$^{10}$
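A small numerical check may help to fix the index placement in eqns B.10 and B.11. The Python sketch below is set in the flat two-dimensional plane in polar coordinates $(r, \theta)$, an example chosen by us rather than taken from the text, where the only non-zero connection coefficients are $\Gamma^{r}{ }_{\theta \theta}=-r$ and $\Gamma^{\theta}{ }_{r \theta}=\Gamma^{\theta}{ }_{\theta r}=1 / r$. The field tested is the constant Cartesian vector $\boldsymbol{e}_{x}$, whose polar components $(\cos \theta,-\sin \theta / r)$ vary from point to point even though the vector itself does not. Partial derivatives are approximated by central differences.

```python
import math


def christoffel(r):
    """G[alpha][mu][nu] = Gamma^alpha_{mu nu} for the flat plane in (r, theta)."""
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -r                      # Gamma^r_{theta theta}
    G[1][0][1] = G[1][1][0] = 1.0 / r    # Gamma^theta_{r theta} = Gamma^theta_{theta r}
    return G


def constant_ex(x):
    """Polar components (v^r, v^theta) of the constant Cartesian vector e_x."""
    r, theta = x
    return [math.cos(theta), -math.sin(theta) / r]


def covariant_derivative(v, x, h=1e-6):
    """Return D[alpha][mu] = v^alpha_{;mu}, following eqn B.10."""
    G = christoffel(x[0])
    v_here = v(x)
    D = [[0.0, 0.0], [0.0, 0.0]]
    for mu in range(2):
        x_plus, x_minus = list(x), list(x)
        x_plus[mu] += h
        x_minus[mu] -= h
        v_plus, v_minus = v(x_plus), v(x_minus)
        for alpha in range(2):
            partial = (v_plus[alpha] - v_minus[alpha]) / (2 * h)
            connection = sum(G[alpha][mu][nu] * v_here[nu] for nu in range(2))
            D[alpha][mu] = partial + connection
    return D


def along_curve(u, D):
    """(Dv/dtau)^alpha = u^mu v^alpha_{;mu}, following eqn B.11."""
    return [sum(u[mu] * D[alpha][mu] for mu in range(2)) for alpha in range(2)]


D = covariant_derivative(constant_ex, [2.0, 0.7])
# A constant vector field has vanishing covariant derivative, even though
# the partial derivatives of its polar components do not vanish.
assert all(abs(D[alpha][mu]) < 1e-6 for alpha in range(2) for mu in range(2))
```

The connection term is exactly what cancels the apparent variation of the components caused by the changing basis vectors.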

Manifolds and bundles

Life is a short affair; We should try to make it smooth, and free from strife.
Euripides (c. 480 - c. 406 BC)
Given a shape or a space to understand, such as a two-dimensional spherical surface, we are usually tempted to embed it in a higher dimensional Euclidean space (e.g. three-dimensional space in this case) in order to examine its structure. However, in studying the spacetimes of general relativity, it is certainly not given that the spacetime of our Universe actually lives in some higher dimensional Euclidean space. We must therefore come to terms with a more intrinsic, geometrical description of the fabric of spacetime in terms of a manifold, without relying on the artifice of embedding it in a higher dimensional space. A manifold has only a little mathematical structure of its own. We only insist that a manifold be a smooth space without awkward discontinuities or unusual joints. Manifolds fit naturally with physical models described in terms of classical field theory which rely on this notion of smoothness. Thus, when we examine singularities, we can characterize them as places where the smooth manifold description breaks down.$^{1}$
A good working definition of a manifold is a space that looks locally flat and Euclidean. This constrains the space to change smoothly, since a space full of discontinuities cannot look Euclidean at each of its points. Coordinates, functions, and curves can be defined on manifolds. We can perform calculus on manifolds, use them to define vectors and, if we choose, define metrics on them in order to work out lengths and angles. Nature, as explained by general relativity, seems to be based on the metric and so these two separate ingredients, manifold and metric, form the basis of the geometrical description of Nature.
The manifold point of view is a natural one for describing the geometry of physics. We call ordinary three-dimensional space the manifold $\mathbb{R}^{3}$: it is simply the space in which 3-vectors live. Space can then be thought of as the manifold $\mathbb{R}^{3}$ with a flat metric $\boldsymbol{d}(\ ,\ )$ defined on it that can be used to measure lengths. Special relativity asserts that spacetime is the manifold $\mathbb{R}^{4}$ (i.e. the space of 4-vectors) with a flat metric $\boldsymbol{\eta}(\ ,\ )$ defined on it. In general relativity, spacetime is a manifold, usually called $\mathcal{M}$, on which a Lorentz metric $\boldsymbol{g}(\ ,\ )$ is defined. The curvature of the spacetime is related to the matter distribution in spacetime via Einstein's equation.
In this book, we have been doing our physics on a (pseudo) Riemann manifold, which possesses a connection and a metric. In fact, the Riemann manifold is built upon a series of concepts, as shown in Fig. C.1. In this appendix, we take a step back, forgetting many of the mathematical notions we take for granted, such as the distances and times encoded in the metric. We shall deal with manifolds from the primitive point of view that everything needs to be built from scratch. Taking as little baggage as possible with us, we shall attempt to build a set of concepts suitable to describe the geometrical fabric of the Universe. This topic starts with simple notions of sets and intervals and then introduces the study of manifolds and their structure. We finish by describing some simple concepts of fibres and bundles that lie behind the tangent spaces of differential geometry and the physics of gauges.$^{2}$

C. 1 Preliminaries
C. 2 Maps and functions
C. 3 One-to-one, into, and onto
C. 4 Continuous maps
C. 5 Manifolds, coordinates, and charts
C. 6 Functions on the manifold
C. 7 Differentiation on the manifold
C. 8 Compact regions
C. 9 Curves
C. 10 Tangent spaces
C. 11 Fibre bundles
Chapter summary

$^{1}$ The material in this appendix lies behind the mathematics presented in this book, particularly the material on differential geometry. Although we have tried to minimize its use in the main body of the book, this mathematics provides the reason why modern general relativity looks the way it does, and is therefore used in many modern treatments. For example, the study of singularities used to understand the structure of black holes and cosmology is especially reliant on the use of many of the ideas introduced here.
Fig. C.1 The Riemann manifold $(\mathcal{M}, \boldsymbol{g})$, with its metric structure, exists at the top of a pyramid of concepts in mathematics.
$^{2}$ For those not inclined to venture any further at this stage, here's an executive summary of the content of this appendix: general relativity takes place in a smooth spacetime, and the mathematical structure that has this smoothness is a manifold. On a manifold, instead of a coordinate transformation we have the diffeomorphism and instead of the idea of a boundary we have the notion of compactness. Tangent vectors live in a manifold called a tangent space. The combination of a manifold and a tangent space is known as a fibre bundle.
Fig. C.2 (a) Open interval; (b) closed interval; (c) the union of $A$ and $B$; (d) the intersection of $A$ and $B$; (e) $A$ is a subset of $B$; (f) an open ball in $\mathbb{R}^{2}$; (g) an open cover of a set $A$, which is a subset of $\mathbb{R}^{2}$; (h) a non-Hausdorff space.

C. 1 Preliminaries

A space with a metric defined on it is called a metric space. The metric allows us to work out how far points are from other points. We call a space without a metric a topological space. This has less structure, but we tend to describe points in the neighbourhood of other points in terms of parts of the space known as open subsets, which encode its topology. We introduce some of the relevant ideas here and in Fig. C.2. We start with some primitive notions of sets (or collections of objects or points) and intervals (i.e. a subset of points between two end points, or the neighbourhood of points between two end points). A manifold then turns out to be a set with some special properties. Let's start with some definitions:
  • The open interval $a<x<b$, not including the endpoints $a$ and $b$, is written $(a, b)$ [Fig. C.2(a)]. The closed interval $a \leq x \leq b$, which includes the endpoints $a$ and $b$, is written $[a, b]$ [Fig. C.2(b)].
  • The expression $p \in A$ denotes that $p$ is an element of the set $A$.
  • The expression $A \cup B$ denotes the union of sets $A$ and $B$: the set of objects belonging to $A$, $B$ or both [Fig. C.2(c)].
  • The expression $A \cap B$ denotes the intersection of sets $A$ and $B$: the set of objects belonging to both $A$ and $B$ [Fig. C.2(d)].
  • The expression $A \subset B$ denotes that $A$ is a subset of $B$ [Fig. C.2(e)].
  • The expression $\varnothing$ denotes the empty set, containing no elements.
  • $\mathbb{R}$ is the set of real numbers.
  • We define $\mathbb{R}^{n}$ to be the $n$-dimensional Euclidean space we usually use for vector algebra. A point in $\mathbb{R}^{n}$ is a sequence of real numbers $\left(x^{1}, x^{2}, x^{3}, \ldots x^{n}\right)$, sometimes called an $n$-tuple. The space $\mathbb{R}^{n}$ is a metric space, with the distance between points given by
\begin{equation*} |x-y|=\left[\sum_{\mu=1}^{n}\left(x^{\mu}-y^{\mu}\right)^{2}\right]^{\frac{1}{2}} \tag{C.1} \end{equation*}
An open ball in $\mathbb{R}^{n}$ of radius $r$ centred around a point $y$ consists of the points $x$ such that $|x-y|<r$. This is an example of an open subset of the set $\mathbb{R}^{n}$ [Fig. C.2(f)]. It is some region, usually assumed close to $y$, that doesn't include its boundary.
  • A set of points $S$ of $\mathbb{R}^{n}$ is open if every point in $S$ has an open neighbourhood entirely within $S$. Such a set can be expressed as a union of open balls. Any reasonable chunk of $\mathbb{R}^{n}$ is open if we don't include its boundary in the set. A collection $O$ of open sets is an open cover of a set $A$ if every point in $A$ lies in at least one of the sets in the collection $O$ [Fig. C.2(g)].
  • The Hausdorff property of a set in $\mathbb{R}^{n}$ is the feature that any two distinct points have neighbourhoods that don't intersect (i.e. any line can be infinitely subdivided). A non-Hausdorff space is typified by branching [Fig. C.2(h)]. We shall only ever deal with Hausdorff spaces.
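The definitions in this list can be tried out on a computer. The following is a minimal Python sketch, not part of the original text: finite sets stand in for $A$ and $B$, and the helper names `distance` and `in_open_ball` are hypothetical, chosen here purely for illustration of Eq. (C.1) and of the strict inequality that defines an open ball.

```python
import math

def distance(x, y):
    """Euclidean distance of Eq. (C.1) between two n-tuples x and y."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def in_open_ball(x, centre, r):
    """True if x lies in the open ball of radius r about centre.

    The inequality is strict, so points on the boundary are excluded.
    """
    return distance(x, centre) < r

# Finite sets standing in for the sets A and B of the text (an assumption).
A = {1, 2, 3}
B = {3, 4}

print(A | B)      # union of A and B
print(A & B)      # intersection of A and B
print({3} <= B)   # subset test: is {3} a subset of B?

print(in_open_ball((0.5, 0.5), (0.0, 0.0), 1.0))  # interior point
print(in_open_ball((1.0, 0.0), (0.0, 0.0), 1.0))  # boundary point
```

The boundary point fails the test because of the strict inequality, which is exactly what distinguishes the open ball from the closed one.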
With the simple notions defined, we move on to discussing how to relate one element of a space to another element in another space.

C. 2 Maps and functions

The basic tool for examining the properties of the various mathematical structures of use in physics is mapping. A map $f$ from space $\mathcal{M}$ to space $\mathcal{N}$ is a rule that associates with an element $x$ of $\mathcal{M}$ a unique element $y$ of $\mathcal{N}$. The idea is shown in Fig. C.3. The simplest map$^{3}$ is a real function. For such a function, both $\mathcal{M}$ and $\mathcal{N}$ are the set $\mathbb{R}$ (i.e. the set of real numbers). The function $f$ takes an element $x$ and spits out an element $y$. The notation saying that $f$ maps elements in $\mathcal{M}$ to elements in $\mathcal{N}$ is written as
\begin{equation*} f: \mathcal{M} \rightarrow \mathcal{N} \tag{C.2} \end{equation*}
or in terms of the elements themselves
\begin{equation*} f: x \mapsto y=f(x) \tag{C.3} \end{equation*}
When a map is a real-valued function of $n$ variables we write $f: \mathbb{R}^{n} \rightarrow \mathbb{R}$. This simply amounts to saying that we input an $n$-tuple $\left(x^{1}, \ldots, x^{n}\right)$ to the function and output a single number.$^{4}$
We can combine different mappings. If we have two maps $f$ and $g$, $f: \mathcal{M} \rightarrow \mathcal{N}$ and $g: \mathcal{N} \rightarrow \mathcal{L}$, then there is a map called the composition of $f$ and $g$, denoted $g \circ f$, which maps $\mathcal{M}$ to $\mathcal{L}$. In ordinary algebra, $g \circ f$ would be written as $g(f(x))$.
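Composition is easy to mimic in code. In the sketch below, the helper `compose` and the two sample maps `f` and `g` are assumptions introduced for illustration only; the point is simply that $g \circ f$ means 'apply $f$ first, then $g$', and that the order matters.

```python
def compose(g, f):
    """Return the composition of g with f, i.e. the map x -> g(f(x))."""
    return lambda x: g(f(x))

def f(x):
    """Sample map f: R -> R (chosen arbitrarily for illustration)."""
    return x + 1

def g(x):
    """Sample map g: R -> R (chosen arbitrarily for illustration)."""
    return 2 * x

g_after_f = compose(g, f)   # x -> g(f(x)) = 2 * (x + 1)
f_after_g = compose(f, g)   # x -> f(g(x)) = 2 * x + 1

print(g_after_f(3))
print(f_after_g(3))
```

The two results differ, showing that composition is not commutative in general.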

C. 3 One-to-one, into, and onto

The points, mapped from the subset of points $S$ in $\mathcal{M}$ to points in $\mathcal{N}$, form a new set $T$ called the image of $S$ under $f$, or $f(S)$. The set $S$ is called the inverse image, i.e. $S=f^{-1}(T)$. Using the notion of images, we can identify several sorts of mapping.
  • If the map is many-to-one, then the inverse image of some point of $\mathcal{N}$ is not a single point in $\mathcal{M}$ [Fig. C.4(a)].
  • If every point in $f(S)$ has a unique inverse image point in $S$, then $f$ is said to be one-to-one or 1-1 [Fig. C.4(b) and (c)].
  • If a map $\mathcal{M} \rightarrow \mathcal{N}$ is defined for all points in $\mathcal{M}$ (i.e. $S=\mathcal{M}$), then the mapping is from $\mathcal{M}$ into $\mathcal{N}$ [Fig. C.4(b) and (c)].
  • If every point in $\mathcal{N}$ has an inverse image (not necessarily a unique one), we say it is a mapping from $\mathcal{M}$ onto $\mathcal{N}$.
  • A map that is both 1-1 and onto is called a bijection$^{5}$ [Fig. C.4(c)]. Only bijective maps have a unique inverse that is a map, which we denote $f^{-1}: \mathcal{N} \rightarrow S$. This map is then also bijective.
$^{3}$ In many texts, such as this one, the terms map and function are used interchangeably.
Fig. C.3 A function as a mapping: input an element $x$, output an element $y=f(x)$.
$^{4}$ Of course, we would usually write this as $f\left(x^{1}, \ldots, x^{n}\right)=y$.
Fig. C.4 (a) Many-to-one and (b) into mappings. (c) A bijection, which is both 1-1 and onto.
$^{5}$ Other terms which are used in the mathematical literature to classify functions are:
  • injective $=$ one-to-one;
  • surjective $=$ onto;
  • bijective $=$ both surjective and injective.
The 'sur' in 'surjective' is from the French sur meaning on. The 'bi' in bijective reminds you that bijective combines two properties.
Fig. C.5 Functions discussed in Example C.1.

Example C. 1

Figure C.5 shows examples of functions, $y=f(x)$, which can be thought of as maps from $\mathbb{R}$ to $\mathbb{R}$, i.e. $f: \mathbb{R} \rightarrow \mathbb{R}$.
(i) Figure C.5(a) is 1-1, but not onto.
(ii) Figure C.5(b) is onto but not 1-1.
(iii) Figure C.5(c) is a bijection (i.e. both 1-1 and onto).
(iv) Figure C.5(d) is neither 1-1 nor onto.
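For maps between finite sets, the classification above can be checked by brute force. The following Python sketch is only an illustration under stated assumptions: the helper names `is_one_to_one`, `is_onto` and `is_bijection` are hypothetical, the sample maps are not those drawn in Fig. C.5, and each map is assumed to send its domain into the stated codomain.

```python
def is_one_to_one(f, domain):
    """True if no two distinct points of the domain share an image."""
    images = [f(x) for x in domain]
    return len(set(images)) == len(images)

def is_onto(f, domain, codomain):
    """True if every point of the codomain has at least one inverse image."""
    return {f(x) for x in domain} == set(codomain)

def is_bijection(f, domain, codomain):
    """True if the map is both 1-1 and onto."""
    return is_one_to_one(f, domain) and is_onto(f, domain, codomain)

def double(x):
    return 2 * x

def square(x):
    return x * x

def shift(x):
    return x + 1

# 1-1 but not onto: the odd numbers in the codomain are never reached.
print(is_one_to_one(double, [0, 1, 2]), is_onto(double, [0, 1, 2], [0, 1, 2, 3, 4]))
# Onto but not 1-1: x and -x share the same image.
print(is_one_to_one(square, [-2, -1, 0, 1, 2]), is_onto(square, [-2, -1, 0, 1, 2], [0, 1, 4]))
# A bijection.
print(is_bijection(shift, [0, 1, 2], [1, 2, 3]))
```

Note that whether a map is onto depends on the choice of codomain as well as on the rule itself, which is why the codomain is passed explicitly.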

C. 4 Continuous maps

The mathematician is often looking for ways to say that two spaces or systems look the same, or at least similar, since this constitutes a useful method of classifying the structure of a space. The way this is done in mathematical physics is via the use of morphisms. Roughly, a morphism is a type of map that preserves structure, allowing us to move between two spaces in order to compare them. Two sorts of morphisms are relevant in geometry: the homeomorphism and the diffeomorphism. The latter is the important morphism for general relativity. We shall first meet the homeomorphism which can be used, in this context, to say that two spaces share the same sort of continuous structure. The diffeomorphism is similar, but also includes the idea that the spaces are differentiable.
A map $\phi: \mathcal{M} \rightarrow \mathcal{N}$ is continuous at point $x$ in $\mathcal{M}$ if any open set of $\mathcal{N}$ containing $\phi(x)$ contains the image of an open set of $\mathcal{M}$ containing $x$.
A homeomorphism is a 1-1, onto map from one space to another which is continuous and whose inverse is continuous.
Two spaces with a homeomorphism between them are said to be homeomorphic. Roughly speaking, a homeomorphism preserves the topological properties of a space, so that its 'overall shape' or 'overall structure' is preserved. In this sense, a homeomorphism allows us to say that a space 'looks like' another space. If this seems abstract, then one example of a homeomorphism to keep in mind is a continuous deformation. If you imagine that objects are made from a mouldable clay then if you can deform one object into another without breaking it, punching holes in the clay, gluing disparate parts or healing up holes already there, the objects are homeomorphic.$^{6}$
Example C. 2
Some examples of things that are homeomorphisms and things that aren't are the following. [Some terminology (in italics) is explained later in the chapter.]
I The unit disc is the interior of a unit circle. It is homeomorphic to the interior of a unit square, as the disc can be continuously deformed into the square.
II The graph of a differentiable function is homeomorphic to the domain of the function.
III A differentiable parametrization of a curve is a homeomorphism between the domain of parametrization and the curve.
IV A coffee mug and doughnut can be continuously deformed into one another (Fig. C.6). This continuous deformation is one example of a homeomorphism, so the coffee mug and doughnut are homeomorphic.
V The set $\mathbb{R}^{m}$ is not homeomorphic to $\mathbb{R}^{n}$ if $m \neq n$.
VI The Euclidean real line is not homeomorphic to the circle. (This is because the unit circle is compact, but the real line is not.)

C. 5 Manifolds, coordinates, and charts

As we have said, the symbol $\mathbb{R}^{n}$ represents the set of all $n$-tuples of real numbers $\left(x^{1}, x^{2}, x^{3}, \ldots, x^{n}\right)$. This is another way of saying it is the ordinary space in which vectors live. It is also known as flat, Euclidean space. We started with the idea that a manifold is a set of points that, locally, looks like $\mathbb{R}^{n}$. A more precise definition is as follows:
The set $\mathcal{M}$ is a manifold if each point of $\mathcal{M}$ has an open neighbourhood that is homeomorphic to an open set of $\mathbb{R}^{n}$ for some $n$.
If an object has some point that at no level of magnification can be made to look like the flat space of $\mathbb{R}^{n}$, then it is not a manifold. Note that a manifold, on its own, does not preserve lengths, angles, or other geometric quantities.
Example C. 3
Some examples of manifolds are the following:
  • The $m$-dimensional space $\mathbb{R}^{m}$ itself is a manifold. It looks locally like $\mathbb{R}^{m}$, after all!
  • The circle $S^{1}$ is a manifold. It looks locally like $\mathbb{R}$ [see Fig. C.7(a)].
  • The sphere $S^{2}$ is a manifold, looking locally like $\mathbb{R}^{2}$ [see Fig. C.7(b)].
  • The torus $T^{2}$ is a manifold, looking locally like $\mathbb{R}^{2}$ [see Fig. C.7(c)].
  • A plane with a line jutting out of it (Fig. 31.1 in Chapter 31) is not a manifold. The point of intersection never looks smooth at any level of magnification.
  • The double cone (Fig. 31.1) is not a manifold. The position where the apex of one cone touches the other never looks smooth.
Fig. C.6 A coffee mug can be continuously deformed into a doughnut. Note that this only works because the coffee mug has a handle. A handleless coffee mug is not homeomorphic to the doughnut because, at some stage of the deformation, you would need to rip a hole in the 'deformable clay'. Hole-making is not a continuous deformation.
Fig. C.7 (a) A circle $S^{1}$ looks locally like $\mathbb{R}$. Note that by 'circle' we mean the one-dimensional space that is the boundary of a disc. (b) A sphere $S^{2}$ looks locally like $\mathbb{R}^{2}$. Note that by 'sphere' we mean the two-dimensional space that is the boundary of a ball (i.e. what is often called a 'spherical surface'). (c) A torus $T^{2}$ is the product space $S^{1} \times S^{1}$ obtained from two circles (shown here as the two circles in bold). It looks locally like $\mathbb{R}^{2}$. A torus is the space describing the surface of an (edible) doughnut.
Fig. C.8 An open set $U$ of the manifold $\mathcal{M}$ is mapped to the set $\phi(U)$ in $\mathbb{R}^{m}$.
Fig. C.9 Coordinate neighbourhoods, chosen to cover the unit circle manifold. (a) The homeomorphism $\phi_{1}$ maps a point from the subset $U_{1}$ on the manifold to a value $\theta$. (b) We must exclude the point shown from $U_{1}$, since this could be mapped to both 0 and $2\pi$. (c) A different subset $U_{2}$ excludes a different point to that excluded from $U_{1}$.
A point $\mathcal{P}$ in an $m$-dimensional manifold $\mathcal{M}$ exists independently of any coordinates. However, we want to be able to identify points like $\mathcal{P}$ on the manifold using our familiar coordinates $\left(x^{1}, \ldots, x^{m}\right)$ which, to remind you, live in $\mathbb{R}^{m}$. To do this, we need to map between the manifold $\mathcal{M}$ and $\mathbb{R}^{m}$. That map won't necessarily be a homeomorphism, since $\mathcal{M}$ and $\mathbb{R}^{m}$ might have different topologies. However, since $\mathcal{M}$ looks like $\mathbb{R}^{m}$ locally, a homeomorphism $\phi$ (which we call a coordinate function) can be constructed which maps between $U$ and $\mathbb{R}^{m}$, where $U$ is called a coordinate neighbourhood; this is an open set of the manifold, $U \subset \mathcal{M}$, which contains $\mathcal{P}$. In mathematical language, $\phi: U \rightarrow \mathbb{R}^{m}$, and this setup is shown pictorially in Fig. C.8. The coordinate function $\phi$ is represented by $m$ real functions of the point $\mathcal{P}$, written as $\left\{x^{1}(\mathcal{P}), \ldots, x^{m}(\mathcal{P})\right\}$. This set is also often called a coordinate, for the sake of brevity. There is lots of scope for confusion here because we generally use $x$ to represent both the functions of $\mathcal{P}$ and the coordinates themselves. A simple shorthand equation to keep in mind is that the coordinates are given by
\begin{equation*} x^{\mu}=\phi(\mathcal{P}) \tag{C.4} \end{equation*}
In words: input a point $\mathcal{P}$ from $U \subset \mathcal{M}$ and output a point $x^{\mu}$ in $\mathbb{R}^{m}$. Since $\phi$ is a homeomorphism, and therefore has a unique inverse, we can write things the other way round
\begin{equation*} \mathcal{P}=\phi^{-1}\left(x^{\mu}\right) \tag{C.5} \end{equation*}
which, in words, says that we input a coordinate $x^{\mu}$ to $\phi^{-1}$ which outputs a point $\mathcal{P}$ on the open set $U$ on the manifold.
We have to focus on the coordinate neighbourhood $U \subset \mathcal{M}$, rather than on the whole manifold, because the coordinate function $\phi$ often cannot be 1-1 over the entire manifold (because $\mathcal{M}$ only looks like $\mathbb{R}^{m}$ locally). We only need to be able to map the region of $\mathcal{M}$ close to the point $\mathcal{P}$ to $\mathbb{R}^{m}$ using our particular homeomorphism $\phi$. We can then map the manifold near other points to $\mathbb{R}^{m}$ using a different homeomorphism.
Example C. 4
The unit circle is a manifold. We set up a map $\phi_{1}$ which takes points on the manifold to the coordinate $\theta$ in $\mathbb{R}$, assumed to vary between 0 and $2 \pi$ [see Fig. C.9(a)]. However, this won't work for all points in the manifold since the point $\mathcal{P}$ which we describe in $\mathbb{R}$ as $\theta=0$ is also ascribed the point in $\mathbb{R}$ called $\theta=2 \pi$. The map $\phi_{1}$ that takes points on the manifold to $\mathbb{R}$ is not 1-1 if we include this point, so we are forced to drop it completely. The subset $U_{1}$ of $\mathcal{M}$, for which $\phi_{1}$ is well defined, then encompasses all of the unit circle except this troublesome point $\mathcal{P}$ [see Fig. C.9(b)].
In general, we may need a collection of open sets, $U_{i}$, and a collection of maps $\phi_{i}$, to complete our description. We want to be able to patch these $U_{i}$s together to completely cover the manifold.
Example C. 5
Returning to the unit circle, we can come up with an alternative subset of the manifold $\mathcal{M}$, called $U_{2}$. This one includes the whole of the circle except a point $\mathcal{Q} \neq \mathcal{P}$. As drawn in Fig. C.9(c), the missing point is the one that a map $\phi_{2}$ would take to $\pi$ and/or $-\pi$. The map $\phi_{2}$ does however take all other points on $U_{2}$ to $\mathbb{R}$ in the open interval $-\pi$ to $\pi$. We see that by missing $\mathcal{Q}$ we do capture the point $\mathcal{P}$ that was not covered by $U_{1}$. Taken together, $U_{1}$ and $U_{2}$ are seen to cover $\mathcal{M}$.
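The two charts of Examples C.4 and C.5 can be sketched numerically. The Python below is illustrative only and rests on assumptions that are not in the text: a point of the circle is represented by its Cartesian pair $(x, y)$, the excluded points are taken to be $\mathcal{P}=(1,0)$ and $\mathcal{Q}=(-1,0)$, and the function names `phi1`, `phi2` and `phi_inv` are hypothetical.

```python
import math

def phi1(point):
    """Chart on U1 (the circle minus P = (1, 0)): returns theta in (0, 2*pi)."""
    x, y = point
    if y == 0 and x > 0:
        raise ValueError("P = (1, 0) is excluded from U1")
    # atan2 returns an angle in (-pi, pi]; the modulo shifts it to (0, 2*pi).
    return math.atan2(y, x) % (2 * math.pi)

def phi2(point):
    """Chart on U2 (the circle minus Q = (-1, 0)): returns theta in (-pi, pi)."""
    x, y = point
    if y == 0 and x < 0:
        raise ValueError("Q = (-1, 0) is excluded from U2")
    return math.atan2(y, x)

def phi_inv(theta):
    """Inverse of either chart: the point of the circle at angle theta."""
    return (math.cos(theta), math.sin(theta))

P = (1.0, 0.0)
Q = (-1.0, 0.0)

print(phi2(P))   # P is missing from U1 but is covered by U2
print(phi1(Q))   # Q is missing from U2 but is covered by U1
```

Every point of the circle is accepted by at least one of the two functions, which is the computational counterpart of the statement that $U_{1}$ and $U_{2}$ together cover $\mathcal{M}$.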
Generally, then, the subsets $U_{i}$ are a family of open subsets that, taken together, cover $\mathcal{M}$. The map $\phi_{i}$ maps from the subset $U_{i}$ onto an open subset of $\mathbb{R}^{m}$. The subset $U_{i}$ is called a coordinate neighbourhood. The pair $\left(U_{i}, \phi_{i}\right)$ is called a chart.$^{7}$ The whole family of charts $\left\{\left(U_{i}, \phi_{i}\right)\right\}$ is called an atlas.
Example C. 6
Consider two examples of spaces:
(i) $m$-dimensional Euclidean space. A single chart covers all of this space.
(ii) One-dimensional space. There are two possible manifolds: the line $\mathbb{R}^{1}$ and the circle $S^{1}$. A single chart covers the line. As we saw before, (at least) two charts are needed to cover the circle.
In having more than one set of coordinates (i.e. more than one chart), we do ask that they are compatible. Consider a manifold $\mathcal{M}$ with overlapping subsets $U$ and $V$ (Fig. C.10). The point $\mathcal{P}$ lies in the overlapping region. Homeomorphisms are defined such that $\phi: U \rightarrow \mathbb{R}^{m}$ and $\psi: V \rightarrow \mathbb{R}^{m}$. We have charts $(U, \phi)$ and $(V, \psi)$ and write $\phi(\mathcal{P})=x^{\mu}$ and $\psi(\mathcal{P})=y^{\mu}$. By defining a composite map that combines the two homeomorphisms, we are able to recover the idea of a coordinate transformation. In order to get from $x^{\mu}$ to $y^{\mu}$ (and motivated by the diagram in Fig. C.10), we write
\begin{equation*} y^{\mu}=\psi \circ \phi^{-1}\left(x^{\mu}\right) \tag{C.6} \end{equation*}
In words, input a coordinate $x^{\mu}$ that is taken to a point $\mathcal{P}$ on $\mathcal{M}$ and output a coordinate $y^{\mu}$ that corresponds to the same point.
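As a concrete instance of Eq. (C.6), consider again the unit circle with two angular charts. The sketch below is an illustration under assumptions made here: $\phi$ is taken to assign an angle in $(0, 2\pi)$, $\psi$ an angle in $(-\pi, \pi)$, points of the circle are stored as Cartesian pairs, and the function names are hypothetical.

```python
import math

def phi_inv(x):
    """Inverse of the chart phi: an angle x in (0, 2*pi) -> point of the circle."""
    return (math.cos(x), math.sin(x))

def psi(point):
    """Chart psi: a point of the circle (other than (-1, 0)) -> angle in (-pi, pi)."""
    px, py = point
    return math.atan2(py, px)

def transition(x):
    """Coordinate transformation y = psi(phi_inv(x)), as in Eq. (C.6)."""
    return psi(phi_inv(x))

# On the overlap, y = x for 0 < x < pi and y = x - 2*pi for pi < x < 2*pi.
for x in (0.5, 2.0, 4.0, 6.0):
    print(x, transition(x))
```

The input and output are two different coordinates for one and the same point of the manifold; the point itself appears only as the intermediate value passed from `phi_inv` to `psi`.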
We are now in the position to put everything together and write down a more technical description of a manifold. Of course, there's very little here we haven't seen earlier in words.
A manifold $\mathcal{M}$ has the following properties:
  • Each $\mathcal{P} \in \mathcal{M}$ lies in at least one open set $U_{i}$ (i.e. the $\left\{U_{i}\right\}$ cover $\mathcal{M}$).
  • For each $i$ there is a homeomorphism $\phi_{i}: U_{i} \rightarrow \phi_{i}\left(U_{i}\right)$, where $\phi_{i}\left(U_{i}\right)$ is an open subset of $\mathbb{R}^{m}$.
  • Where any two sets $U_{i}$ and $U_{j}$ overlap, we have a composite map $\phi_{i} \circ \phi_{j}^{-1}$ which takes points in $\phi_{j}\left(U_{i} \cap U_{j}\right) \subset \mathbb{R}^{m}$ to points in $\phi_{i}\left(U_{i} \cap U_{j}\right) \subset \mathbb{R}^{m}$.

C. 6 Functions on the manifold

As introduced above, a function can be thought of as a map. In addition, a function can be thought of as living on a manifold. However, we really only have access to coordinates in $\mathbb{R}^{m}$ and so we often want to input and output these coordinates when interacting with objects defined on the manifold. Let's consider the function $f: \mathcal{M} \rightarrow \mathbb{R}$, that inputs some point $\mathcal{P}$ in the $m$-dimensional manifold $\mathcal{M}$, and outputs a number in $\mathbb{R}$. The point $\mathcal{P}$ lies in the subset $U$ of $\mathcal{M}$. The question then is how can we tell how $f$ assigns a real value to each point on $\mathcal{M}$, while we only have access to coordinates in $\mathbb{R}^{m}$.
$^{7}$ A chart is often called a coordinate system by physicists.
Fig. C.10 A coordinate transformation.
Fig. C.11 A function as a composite map $f \circ \phi^{-1}: \phi(U) \rightarrow \mathbb{R}$.
Fig. C.12 A function that maps between manifolds.
The answer is shown in Fig. C.11. The homeomorphism $\phi$ takes $\mathcal{P}$ to coordinate $x^{\mu}=\phi(\mathcal{P})$ and $U$ to coordinate neighbourhood $\phi(U) \subset \mathbb{R}^{m}$. This means we can write a composition $f \circ \phi^{-1}: \phi(U) \rightarrow \mathbb{R}$. In words, we input a coordinate $x^{\mu}$ in the region of $\mathbb{R}^{m}$ called $\phi(U)$ and output a point $y^{1}$ on the real line $\mathbb{R}$. This is all we ask of this function, which is a machine that takes multidimensional points and outputs a number. The message then is that what we usually call $y=f\left(x^{1}, x^{2}, \ldots, x^{m}\right)$ should, when dealing with a function on a manifold that takes $\mathcal{M}$ to $\mathbb{R}$, be regarded as
\begin{equation*} y=f \circ \phi^{-1}\left(x^{1}, \ldots, x^{m}\right) \tag{C.7} \end{equation*}
Another way of saying this is that $f \circ \phi^{-1}\left(x^{\mu}\right)$ is the coordinate representation of the function.
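To see Eq. (C.7) at work, here is a small Python sketch for the sphere $S^{2}$. Everything specific in it is an assumption made for illustration: the function on the manifold is taken to be the height of a point, the chart is the usual pair of polar and azimuthal angles, points of the sphere are stored as Cartesian triples, and the names `f`, `phi_inv` and `f_in_chart` are hypothetical.

```python
import math

def f(point):
    """A function on the manifold: the height z of a point of the unit sphere."""
    x, y, z = point
    return z

def phi_inv(theta, varphi):
    """Inverse chart: angular coordinates (theta, varphi) -> point of the sphere."""
    return (
        math.sin(theta) * math.cos(varphi),
        math.sin(theta) * math.sin(varphi),
        math.cos(theta),
    )

def f_in_chart(theta, varphi):
    """Coordinate representation of f, i.e. f composed with phi_inv (Eq. C.7)."""
    return f(phi_inv(theta, varphi))

# In these coordinates the representation is simply cos(theta).
print(f_in_chart(math.pi / 3, 1.0))
print(f_in_chart(math.pi / 2, 2.5))
```

The function `f` knows nothing about coordinates; only `f_in_chart` does, and a different chart would give a different formula for the same `f`.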
Example C. 7
Consider a function that maps between two different $m$-dimensional manifolds $f: \mathcal{M} \rightarrow \mathcal{N}$ as shown in Fig. C.12. That is, it takes a point $\mathcal{P}$ in the $m$-dimensional manifold $\mathcal{M}$ to a point $f(\mathcal{P})$ on the $m$-dimensional manifold $\mathcal{N}$. Take a chart $(U, \phi)$ on $\mathcal{M}$ and a chart $(V, \psi)$ on $\mathcal{N}$. Take $\mathcal{P}$ to be in $U$ and $f(\mathcal{P})$ to be in $V$. The function has a coordinate representation
\begin{equation*} \left(y^{1}, \ldots, y^{m}\right)=\psi \circ f \circ \phi^{-1}\left(x^{1}, \ldots, x^{m}\right) \tag{C.8} \end{equation*}
that is, it's an $m$-tuple-valued function $y^{\mu}=f\left(x^{\mu}\right)$ as shown in Fig. C.12.

C. 7 Differentiation on the manifold

Let's consider an $m$-dimensional manifold $\mathcal{M}$ and a function $f: \mathcal{M} \rightarrow \mathbb{R}$. We differentiate functions by varying them with respect to coordinates $x^{\mu}$. It might seem like defining differentiation on manifolds should just rely on identifying a function $f$ and a chart $(U, \phi)$ to map onto $\mathbb{R}^{m}$. This would allow us to vary $f \circ \phi^{-1}$ with respect to coordinates $x^{\mu}$ giving partial derivatives like $\frac{\partial}{\partial x^{\nu}}\left(f \circ \phi^{-1}\right)$. It is, unfortunately, not quite that simple. However, the fix we need provides the manifold with a rich structure that makes the extra effort involved in defining differentiation more than worth it.
First, the problem: it would seem reasonable that $f$ should be differentiable if $f \circ \phi^{-1}$ is differentiable, but this is not the case. If $\psi$ is another homeomorphism such that $\psi: V \rightarrow \mathbb{R}^{m}$ and $U \cap V \neq \varnothing$ then it's not necessarily the case that $f \circ \psi^{-1}$ is also differentiable. Since we can write that
\begin{equation*} f \circ \psi^{-1}=f \circ \phi^{-1} \circ\left(\phi \circ \psi^{-1}\right) \tag{C.9} \end{equation*}
we need the coordinate transformation $\phi \circ \psi^{-1}$ to be differentiable in order that $f \circ \psi^{-1}$ is also differentiable.
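A simple numerical experiment shows what can go wrong. The example is not taken from the text and is offered only as a sketch: we take the manifold to be the real line, the chart $\phi$ to be the identity, and a second chart $\psi(p)=p^{3}$, which is a homeomorphism of $\mathbb{R}$ whose inverse $\psi^{-1}(y)=y^{1/3}$ is continuous but not differentiable at $y=0$. With $f(p)=p$, the representation $f \circ \phi^{-1}$ has slope 1 at the origin, whereas the difference quotient of $f \circ \psi^{-1}$ grows without bound as the step shrinks. All function names are hypothetical.

```python
import math

def cbrt(y):
    """Real cube root, valid for negative arguments too."""
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

def f(p):
    """A function on the manifold R: simply return the point itself."""
    return p

def phi_inv(x):
    """Inverse of the identity chart phi(p) = p."""
    return x

def psi_inv(y):
    """Inverse of the chart psi(p) = p**3."""
    return cbrt(y)

def f_phi(x):
    """Coordinate representation of f in the chart phi."""
    return f(phi_inv(x))

def f_psi(y):
    """Coordinate representation of f in the chart psi."""
    return f(psi_inv(y))

def slope_at_zero(func, h):
    """Symmetric difference quotient of func at the origin with step h."""
    return (func(h) - func(-h)) / (2.0 * h)

for h in (1e-2, 1e-4, 1e-6):
    print(h, slope_at_zero(f_phi, h), slope_at_zero(f_psi, h))
```

The second column stays at 1 while the third grows like $h^{-2/3}$, so the same $f$ looks differentiable in one chart and not in the other. The culprit is the transition map $\phi \circ \psi^{-1}(y)=y^{1/3}$, exactly as Eq. (C.9) suggests.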
Therefore, in order to be able to differentiate on a manifold with charts $\left(U_{i}, \phi_{i}\right)$, we need that if a pair of regions $U_{i}$ and $U_{j}$ overlaps such that $U_{i} \cap U_{j} \neq \varnothing$, then the map
\begin{equation*} \phi_{i} \circ \phi_{j}^{-1} \tag{C.10} \end{equation*}
should be infinitely differentiable (a property denoted $C^{\infty}$).${}^{8}$ We call such homeomorphisms $C^{\infty}$-related. The idea of compatible charts allows us to construct a maximal atlas, which is the atlas that contains every compatible $C^{\infty}$-related chart. (This allows us to ensure that two equivalent spaces with different atlases aren't actually two different manifolds.) A differentiable manifold is then specified by the set $\mathcal{M}$ and its (unique, maximal) atlas of $C^{\infty}$-related charts $\{U\}$. Defined in this way, the differentiable manifold carries a significant amount of structure.
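The role of the transition map in eqn C.9 can be checked numerically. The sketch below is only an illustration under assumed choices: the set is $\mathbb{R}$, the chart `phi` is the identity, and the second chart `psi` (the cube map) is a hypothetical homeomorphism picked because it is not $C^{\infty}$-related to `phi`. The difference quotient of $f \circ \phi^{-1}$ at the origin settles down, while that of $f \circ \psi^{-1}$ grows without bound as the step shrinks.

```python
import math

def cbrt(y):
    """Real cube root, valid for negative arguments too."""
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

def phi(p):
    """Identity chart on M = R."""
    return p

def psi(p):
    """Hypothetical second chart: a homeomorphism of R, but not C-infinity related to phi."""
    return p ** 3

def psi_inv(y):
    return cbrt(y)

def f(p):
    """A function on the manifold."""
    return p

def f_in_phi(x):
    """f o phi^{-1}; phi^{-1} is the identity here."""
    return f(x)

def f_in_psi(y):
    """f o psi^{-1} = (f o phi^{-1}) o (phi o psi^{-1}), as in eqn C.9."""
    return f_in_phi(phi(psi_inv(y)))

def difference_quotient(g, h):
    """Symmetric difference quotient of g at the origin."""
    return (g(h) - g(-h)) / (2.0 * h)

for h in (1e-2, 1e-4, 1e-6):
    print(h, difference_quotient(f_in_phi, h), difference_quotient(f_in_psi, h))
```

The second column stays at unity, while the third grows like $h^{-2/3}$, which is the numerical signature of a transition map $\phi \circ \psi^{-1}$ that fails to be differentiable at the origin.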

Example C. 8

Let's consider the simplest possible differentiable manifold. Take a manifold $\mathcal{N}$ to be $\mathbb{R}$ and a homeomorphism $\eta: \mathbb{R} \rightarrow \mathbb{R}$ to be the identity $x \mapsto x$ (or, more simply, $\eta(x)=x$). The manifold $\mathcal{N}$ taken with the maximal atlas that contains the identity is a differentiable manifold. This is because${}^{9}$ the identity $\eta(x)=x$ is indeed $C^{\infty}$.
Having fixed up differentiation on a manifold in terms of its homeomorphisms, we can characterize a differentiable function between manifolds${}^{10}$ (referring to Fig. C.12 again).
A function that maps between manifolds $f: \mathcal{M} \rightarrow \mathcal{N}$ is differentiable if for every coordinate system $(\phi, U)$ in $\mathcal{M}$ and $(\psi, V)$ in $\mathcal{N}$, the map $\psi \circ f \circ \phi^{-1}: \mathbb{R}^{m} \rightarrow \mathbb{R}^{n}$ is differentiable.
We now have a concept of a differentiable manifold and a differentiable map between manifolds. If we further insist that the map $\psi \circ f \circ \phi^{-1}$ is invertible (i.e. that there exists a map $\phi \circ f^{-1} \circ \psi^{-1}$), and that both $y=\psi \circ f \circ \phi^{-1}(x)$ and $x=\phi \circ f^{-1} \circ \psi^{-1}(y)$ are $C^{\infty}$, then $\mathcal{M}$ is said to be diffeomorphic to $\mathcal{N}$, and the map $f$ is called a diffeomorphism. Because the map is invertible, the dimension of $\mathcal{M}$ has to equal that of $\mathcal{N}$, i.e. $m=n$. Two diffeomorphic manifolds can be regarded as essentially the same manifold.${}^{11}$
Example C.9
The map $f: \mathbb{R} \rightarrow \mathbb{R}$ given by $f(x)=x$ is a diffeomorphism. However, the map $g: \mathbb{R} \rightarrow \mathbb{R}$ given by $g(x)=x^{2}$ is not a diffeomorphism because it is not 1-1 (e.g. $g(2)=4$, but also $g(-2)=4$). The map $h: \mathbb{R} \rightarrow \mathbb{R}$ given by $h(x)=x^{3}$ is not a diffeomorphism either. Although $h$ is a 1-1 map, its inverse $h^{-1}(x)=x^{1/3}$ is not sufficiently smooth (i.e. $C^{\infty}$) at $x=0$, since its first derivative is not defined. The map $\phi: \mathbb{R}^{2} \rightarrow \mathbb{R}^{2}$ given by $\phi(x, y)=\left(x+\frac{y}{2}, y-\frac{x}{2}\right)$ is a diffeomorphism; the determinant of the Jacobian of the map is non-zero everywhere${}^{12}$ and so the map is invertible.
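The claims in this example are easy to check by hand, and a short numerical sketch may help fix them. It assumes nothing beyond the maps given above; the helper names (`phi_inv`, `det_J`) are introduced here purely for illustration.

```python
def g(x):
    """g(x) = x^2: not one-to-one, so not a diffeomorphism."""
    return x ** 2

def phi(x, y):
    """The linear map phi(x, y) = (x + y/2, y - x/2)."""
    return (x + y / 2.0, y - x / 2.0)

# Jacobian matrix (d phi^i / d x^j); it is constant because phi is linear.
J = [[1.0, 0.5],
     [-0.5, 1.0]]
det_J = J[0][0] * J[1][1] - J[0][1] * J[1][0]

def phi_inv(u, v):
    """Inverse of phi, obtained by inverting the 2x2 Jacobian matrix."""
    x = (J[1][1] * u - J[0][1] * v) / det_J
    y = (-J[1][0] * u + J[0][0] * v) / det_J
    return (x, y)

print("g(2) == g(-2):", g(2.0) == g(-2.0))
print("Jacobian determinant:", det_J)
print("round trip of (1, 2):", phi_inv(*phi(1.0, 2.0)))
```

Because the Jacobian determinant is a non-zero constant, the inverse is again a linear (and hence smooth) map, which is what makes $\phi$ a diffeomorphism.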
${}^{8}$ If $f\left(x_{1}, \ldots, x_{n}\right)$ is a function defined on an open region $S$ of $\mathbb{R}^{n}$, then it is differentiable of class $C^{k}$ if all of the partial derivatives of order less than or equal to $k$ exist and are continuous functions on $S$. A special case is the $C^{\infty}$ (or smooth) function: a map is $C^{\infty}$ if the coordinates of a point in $\mathcal{N}$ are infinitely differentiable functions of the coordinates of the inverse image of the point in $\mathcal{M}$. All polynomial functions are $C^{\infty}$. By contrast, a function like $x^{\frac{1}{3}}$ has a first derivative that is not continuous at the origin (where it blows up), so it is not $C^{\infty}$.
${}^{9}$ The identity map $x \mapsto x$, i.e. $\eta(x)=x$, is continuous; its first derivative is unity, which is also continuous. Subsequent derivatives are zero, which is continuous too.
${}^{10}$ We assume manifold $\mathcal{M}$ is $m$ dimensional and manifold $\mathcal{N}$ is $n$ dimensional.
${}^{11}$ A diffeomorphism can only apply to manifolds. This is because of the local smoothness of a manifold that follows from it resembling $\mathbb{R}^{n}$ locally. In contrast, a homeomorphism can apply between things that aren't manifolds.
${}^{12}$ The Jacobian matrix of the map is the matrix of partial derivatives $\left(\partial \phi^{i} / \partial x^{j}\right)$, where in this case $\phi^{1}=x^{1}+\frac{x^{2}}{2}$ and $\phi^{2}=x^{2}-\frac{x^{1}}{2}$. The determinant of this matrix is called the Jacobian, and in this case it is equal to $\frac{5}{4}$ which, crucially, is non-zero.
${}^{13}$ Beyond general relativity, diffeomorphisms are useful in mechanics where they allow an insightful geometrical description of Hamiltonian mechanics. See Geroch's book Geometrical Quantum Mechanics for an introduction.
${}^{14}$ The very important Killing vector fields that tell us about conserved quantities are therefore also most generally given in terms of diffeomorphisms.
${}^{15}$ Recall that we had
\begin{align*} \left(£_{\boldsymbol{u}} \boldsymbol{g}\right)_{\alpha \beta} &= u^{\sigma} g_{\alpha \beta ; \sigma}+g_{\alpha \sigma} u^{\sigma}{ }_{; \beta}+g_{\sigma \beta} u^{\sigma}{ }_{; \alpha}, \tag{C.12} \\ \text{or } \left(£_{\boldsymbol{u}} \boldsymbol{g}\right)_{\alpha \beta} &= 2 u_{(\alpha ; \beta)} . \end{align*}
Note that a homeomorphism is basically a diffeomorphism without the differentiability requirement. One way to think about it is that calling a map between two spaces a homeomorphism means that you can deform one space to the other continuously. Calling a map between two spaces a diffeomorphism tells you something extra; it means that it is possible to deform one space to the other smoothly, and that smoothness of the coordinate transformations is independent of the coordinates chosen.
Why are diffeomorphisms so essential for general relativity? A diffeomorphism $\phi: \mathcal{M} \rightarrow \mathcal{M}$ maps a point to another point in the same manifold. Such diffeomorphisms are analogous to active coordinate transformations that transform from one point to another point. If the physics is unaffected by this transformation then this tells us about points that are the same (or indistinguishable). Diffeomorphisms therefore reveal the gauge symmetries of general relativity.${}^{13}$ As a result, general relativity is sometimes called a diffeomorphism-invariant theory. Moreover, diffeomorphisms allow us to compare tensors defined at different points on a manifold and so the definition of the Lie derivative $£_{\boldsymbol{u}}$ (Chapter 33) is most generally given in terms of diffeomorphisms.${}^{14}$ In this case, the flow along the integral curves is represented by a diffeomorphism, where the vector field $\boldsymbol{u}$ encoding this flow is referred to as the generator of the diffeomorphism. We use this in the next example.

Example C. 10

We can use invariance with respect to diffeomorphisms to justify one of the most fundamental equations in relativity: $\boldsymbol{\nabla} \cdot \boldsymbol{T}=0$. Let's use the machinery of Chapter 40 and examine the variation of the matter Lagrangian with the components of the metric
\begin{equation*} \delta S=\int \mathrm{d}^{4} x \frac{\delta\left(\sqrt{-g} \mathcal{L}_{\mathrm{m}}\right)}{\delta g_{\mu \nu}} \delta g_{\mu \nu} \tag{C.11} \end{equation*}
If the variations in the metric components are generated by diffeomorphisms, we have${}^{15}$ $\delta g_{\mu \nu}=\left(£_{\boldsymbol{u}} \boldsymbol{g}\right)_{\mu \nu}=2 u_{(\mu ; \nu)}$. For a diffeomorphism-invariant theory we have $\delta S=0$ and so, as a result, we write
\begin{equation*} 0=\int \mathrm{d}^{4} x \frac{\delta\left(\sqrt{-g} \mathcal{L}_{\mathrm{m}}\right)}{\delta g_{\mu \nu}} u_{\mu ; \nu} \tag{C.13} \end{equation*}
where we drop the symmetrization of the covariant derivative of $\boldsymbol{u}$, as $\delta\left(\sqrt{-g} \mathcal{L}_{\mathrm{m}}\right) / \delta g_{\mu \nu}$ is symmetric, and so the integral is unaffected by the presence of the symmetrization. Integrating by parts, we find
\begin{equation*} 0=-\int \mathrm{d}^{4} x \sqrt{-g}\, u_{\nu} \nabla_{\mu}\left[\frac{1}{\sqrt{-g}} \frac{\delta\left(\sqrt{-g} \mathcal{L}_{\mathrm{m}}\right)}{\delta g_{\mu \nu}}\right] \tag{C.14} \end{equation*}
This must be true for diffeomorphisms generated by an arbitrary field $\boldsymbol{u}$ and so, using the definition of $\boldsymbol{T}$ in terms of the action from Chapter 40, i.e.
\begin{equation*} T^{\mu \nu}=\frac{2}{\sqrt{-g}} \frac{\delta\left(\sqrt{-g} \mathcal{L}_{\mathrm{m}}\right)}{\delta g_{\mu \nu}} \tag{C.15} \end{equation*}
we can see that the integrand is equivalent to $\left(\boldsymbol{\nabla}_{\mu} \boldsymbol{T}\right)^{\mu \nu}=T_{; \mu}^{\mu \nu}=0$ or $\boldsymbol{\nabla} \cdot \boldsymbol{T}=0$.

C. 8 Compact regions

We have stated that a closed interval between points $a$ and $b$ on the real line $\mathbb{R}$ is written $[a, b]$. It has the nice property that it encompasses a finite interval, doesn't have any holes in it, and includes its boundary. This notion can be generalized for a region of a manifold and the general term we will use is compact. If we say a region is compact, we mean that it doesn't do things like (i) go off to infinity; (ii) have bits removed; nor (iii) have bits of its boundary removed. One of the ways of achieving this is to insist that any sequence of points in our region must have a limit (or accumulation point) that also lies in the region.${}^{16}$
Example C.11
A closed interval $[0,1]$ on $\mathbb{R}$ is compact, as can be seen by considering the sequence of points defined by $1/n$ where $n>0$ is a positive integer, each of which lies within that interval, but crucially so does its limit (the point reached when $n \rightarrow \infty$, which is 0 and is a member of the set of points defined by $[0,1]$). This would not work if we removed the point 0 on the boundary by considering $(0,1]$ instead of $[0,1]$. We conclude that $[0,1]$ is compact, but $(0,1]$ is not compact.
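The sequence argument can be spelled out as a small check. This is only a sketch of the single sequence $1/n$ used in the example (it is not a proof of compactness, which is a statement about every sequence); the function names are introduced here for illustration.

```python
from fractions import Fraction

def in_closed_interval(x):
    """Membership of [0, 1]."""
    return 0 <= x <= 1

def in_half_open_interval(x):
    """Membership of (0, 1]."""
    return 0 < x <= 1

# The first thousand terms of the sequence 1/n, kept exact with Fraction.
terms = [Fraction(1, n) for n in range(1, 1001)]
limit = Fraction(0)  # 1/n -> 0 as n -> infinity

print("all terms in [0, 1]:", all(in_closed_interval(t) for t in terms))
print("all terms in (0, 1]:", all(in_half_open_interval(t) for t in terms))
print("limit in [0, 1]:", in_closed_interval(limit))
print("limit in (0, 1]:", in_half_open_interval(limit))
```

Every term lies in both intervals, but the limit lies only in $[0,1]$; the sequence escapes from $(0,1]$ through the missing boundary point.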
We have defined compactness as an 'upgrade' of the notion of a closed interval $[a, b]$ on $\mathbb{R}$, so it is not surprising that such a closed interval is compact, but this statement is termed the Heine-Borel theorem.${}^{17}$ In fact, it is also possible to show that a subset of real numbers is compact if and only if it is closed and bounded.${}^{18}$ A compact space can therefore be thought of as a space which, if it has a boundary, includes the boundary as part of the space, and it has no missing parts. Some examples of spaces that are and aren't compact are given below.

Example C. 12

  • The closed unit disc is compact, as is the sphere $S^{2}$ and the torus $T^{2}$.
  • The Euclidean plane is not compact (it contains points that run off to infinity). Neither is the open unit disc (points on its boundary are not included in the space). Nor is the closed disc with a hole in it (it has a missing region).

C. 9 Curves

${}^{16}$ An alternative, and rather grand and formal, way of defining a compact region is to say that a region $\mathcal{A}$ is compact if every open cover $O$ contains a finite sub-collection of open sets which also cover $\mathcal{A}$.
${}^{17}$ We state this theorem without proof here, but see e.g. Spivak's Calculus on Manifolds for the full story on this and other theorems about topological spaces. The theorem is named in honour of the German mathematician Eduard Heine (1821-1881) and the French mathematician Émile Borel (1871-1956).
${}^{18}$ The term bounded means that the set doesn't run off to infinity but is enclosed within some finite region; to define a region of a manifold as bounded requires a notion of distance, i.e. it applies only to spaces endowed with a metric.
Fig. C.13 A curve $c(\lambda)$, expressed in $\mathbb{R}^{m}$ by a homeomorphism $\phi$.
${}^{19}$ This is the idea of a fibre bundle, examined in more detail in the next section.
Fig. C.14 A vector field in terms of mappings.
A curve on a manifold can be described using a parametrization, with a real number $\lambda$ telling us how far we are along the curve. Thus, if we take the map $c:[a, b] \rightarrow \mathcal{M}$ as shown, the real number $\lambda$ on the real line $\mathbb{R}$ is mapped on to the point $c(\lambda) \in \mathcal{M}$, producing a curve on $\mathcal{M}$ as $\lambda$ runs from $a$ to $b$. As usual, the homeomorphism $\phi$ maps from $\mathcal{M}$ to a chart in $\mathbb{R}^{m}$. Therefore, the coordinate on the curve on $\mathcal{M}$ corresponding to some value of $\lambda$ is given by the composite map
\begin{equation*} x^{\mu}=\phi \circ c(\lambda) . \tag{C.16} \end{equation*}
In short, input a parameter $\lambda$ and output a point on the curve $\left(x^{1}, \ldots, x^{m}\right)=\phi \circ c(\lambda)$ in $\mathbb{R}^{m}$ (see Fig. C.13).
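As a concrete sketch of the composite map in eqn C.16, take the unit sphere as the manifold, described here through its embedding in $\mathbb{R}^{3}$ purely for convenience. Both the curve (a circle of latitude) and the chart (stereographic projection from the north pole) are illustrative assumptions rather than choices made in the text.

```python
import math

def curve(lam):
    """c: [a, b] -> M. A circle of latitude Z = 0.6 on the unit sphere (illustrative)."""
    return (0.8 * math.cos(lam), 0.8 * math.sin(lam), 0.6)

def chart(point):
    """phi: M -> R^2. Stereographic projection from the north pole (illustrative).

    Valid everywhere on the sphere except the north pole itself.
    """
    X, Y, Z = point
    return (X / (1.0 - Z), Y / (1.0 - Z))

def curve_in_coordinates(lam):
    """The composite map phi o c of eqn C.16."""
    return chart(curve(lam))

for lam in (0.0, math.pi / 2.0, math.pi):
    x1, x2 = curve_in_coordinates(lam)
    print(f"lambda = {lam:.3f} -> (x^1, x^2) = ({x1:.3f}, {x2:.3f})")
```

In this chart the circle of latitude appears as a circle of radius 2 about the origin of $\mathbb{R}^{2}$; a different chart would draw the same curve differently.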

C. 10 Tangent spaces

Next, we build up to the notion of a vector. An arrow does not properly represent a vector on a manifold. There's no origin or concept of straightness, after all. The vectors we shall discuss are tangent vectors, that is, directional derivatives to curves that live in a manifold. A tangent vector does not live in the same manifold in which curves live, but instead lives in a manifold called a tangent space. There is not one tangent space but many: one, in fact, for each point on the manifold. The tangent spaces can be thought of as floating above the manifold.${}^{19}$
We shall generalize the procedure of finding a tangent vector as the directional derivative along a curve by declaring:
The tangent vector at $c(\lambda=0)$ is defined as the directional derivative of a function $f(c(\lambda))$ along the curve $c(\lambda)$, evaluated at $\lambda=0$.
A curve $c(\lambda)$ takes the parameter $\lambda$ from $\mathbb{R}$ and puts it on the manifold. To take it from the manifold back to $\mathbb{R}$ we need a function $f: \mathcal{M} \rightarrow \mathbb{R}$. Once we have this, we can evaluate the rate of change of $f$ along the curve with the parameter $\lambda$ as follows:
\begin{equation*} \binom{\text { Rate of change of } f(c(\lambda)) \text { along }}{\text { the curve, evaluated at } \lambda=0}=\left.\frac{\mathrm{d} f(c(\lambda))}{\mathrm{d} \lambda}\right|_{\lambda=0}=\left.\frac{\mathrm{d}(f \circ c)}{\mathrm{d} \lambda}\right|_{\lambda=0}, \tag{C.17} \end{equation*}
where, in the last part, we've written out the composition.
Using the homeomorphism $\phi: \mathcal{M} \rightarrow \mathbb{R}^{m}$, we can map the point on the manifold into $\mathbb{R}^{m}$, as shown in Fig. C.14, and then the combination $f \circ c$ can be written as
\begin{equation*} f \circ c=\left(f \circ \phi^{-1}\right) \circ(\phi \circ c), \tag{C.18} \end{equation*}
where the first bracket $\left(f \circ \phi^{-1}\right)$ is a real-valued function of a point in $\mathbb{R}^{m}$ [that is, $f \circ \phi^{-1}=f\left(x^{\mu}\right)$] and the second bracket $(\phi \circ c)$ takes a point $\lambda$ from $\mathbb{R}$ and returns a point in $\mathbb{R}^{m}$ [that is, it maps out the curve in coordinate space and could therefore be written as $x^{\mu}(c(\lambda))$]. Since $f \circ \phi^{-1}$ and $\phi \circ c$ are coordinate representations of the function and curve respectively, we can write the derivative in terms of a Leibniz (or chain) rule
\begin{align*} \frac{\mathrm{d}(f \circ c)}{\mathrm{d} \lambda} & =\frac{\partial}{\partial x^{\mu}}\left(f \circ \phi^{-1}\right) \frac{\mathrm{d}}{\mathrm{d} \lambda}(\phi \circ c(\lambda)) \\ & =\left.\frac{\partial f\left(x^{\mu}\right)}{\partial x^{\mu}} \frac{\mathrm{d} x^{\mu}(c(\lambda))}{\mathrm{d} \lambda}\right|_{\lambda=0} . \tag{C.19} \end{align*}
It is this expression that we use to define a vector $\boldsymbol{v}[f]$. It features the differential operator $\frac{\partial}{\partial x^{\mu}}$, which supplies the basis vectors, acting on a function $f$. It comes with a factor $\left.\frac{\mathrm{d} x^{\mu}(c(\lambda))}{\mathrm{d} \lambda}\right|_{\lambda=0}$, that we call the $\mu$th component and which tells us about the rate of change of the curve $c$ with $\lambda$, when it is projected into $\mathbb{R}^{m}$. As a result, the tangent vector $\boldsymbol{v}$ is defined as
\begin{equation*} \boldsymbol{v}[f]=\left.\frac{\mathrm{d} x^{\mu}(c(\lambda))}{\mathrm{d} \lambda}\right|_{\lambda=0} \frac{\partial f}{\partial x^{\mu}} \tag{C.20} \end{equation*}
where the order of the terms on the right-hand side has been swapped to conform to our usual convention of writing $\boldsymbol{v}=v^{\mu} \boldsymbol{e}_{\mu}$.
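A numerical sketch can confirm that the two sides of eqn C.19 agree. The curve and function below are hypothetical, written directly in a two-dimensional chart (so `curve_coords` plays the role of $\phi \circ c$ and `f_coords` that of $f \circ \phi^{-1}$), and derivatives are approximated with central differences.

```python
def curve_coords(lam):
    """x^mu(c(lambda)): a hypothetical curve passing through (1, 2) at lambda = 0."""
    return (1.0 + lam, 2.0 + 3.0 * lam + lam ** 2)

def f_coords(x):
    """f o phi^{-1}: a hypothetical function written in the same chart."""
    return x[0] * x[1] ** 2

def derivative(g, t, h=1e-6):
    """Central-difference approximation to dg/dt."""
    return (g(t + h) - g(t - h)) / (2.0 * h)

def tangent_vector_on(f, curve):
    """v[f] as in eqn C.20: sum over mu of (dx^mu/dlambda) * (df/dx^mu) at lambda = 0."""
    p = curve(0.0)
    total = 0.0
    for mu in range(len(p)):
        component = derivative(lambda lam: curve(lam)[mu], 0.0)

        def f_along_axis(s, mu=mu):
            x = list(p)
            x[mu] = s          # vary only the mu-th coordinate
            return f(x)

        total += component * derivative(f_along_axis, p[mu])
    return total

direct = derivative(lambda lam: f_coords(curve_coords(lam)), 0.0)
via_components = tangent_vector_on(f_coords, curve_coords)
print(direct, via_components)
```

Here the components are $(1, 3)$ and the partial derivatives at $(1, 2)$ are $(4, 4)$, so both routes should give a value close to 16.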
Example C.13
The simplest example results if we let the function $f$ be the coordinate function $x^{\nu}=\phi^{\nu} \circ c(\lambda)$ or
\begin{equation*} \phi^{\nu} \circ c(\lambda)=\phi^{\nu}(c(\lambda))=x^{\nu}(\lambda) \tag{C.21} \end{equation*}
The action of the vector on $x^{\nu}(\lambda)$ is
\begin{equation*} \boldsymbol{v}\left[x^{\nu}\right]=\frac{\mathrm{d} x^{\mu}}{\mathrm{d} \lambda} \frac{\partial x^{\nu}(\lambda)}{\partial x^{\mu}}=\left.\frac{\mathrm{d} x^{\nu}(\lambda)}{\mathrm{d} \lambda}\right|_{\lambda=0} \tag{C.22} \end{equation*}
thereby measuring the rate of change of the coordinate component $x^{\nu}$ with $\lambda$.
We can generalize and say that a tangent vector works on a number of functions (e.g. $f$ and $g$) taken from the set $\mathcal{F}=C^{\infty}(\mathcal{M})$ of all smooth functions from $\mathcal{M}$ to $\mathbb{R}$ (or $f, g \in \mathcal{F}$). We may then firm up the definition a little at this point and say that a tangent vector $\boldsymbol{v}$ at a point $\mathcal{P} \in \mathcal{M}$ is a map $\boldsymbol{v}: \mathcal{F} \rightarrow \mathbb{R}$, which is (i) linear and (ii) obeys the Leibniz rule, or
\begin{equation*} \begin{aligned} \text{(i)} \quad & \boldsymbol{v}(\alpha f+\beta g)=\alpha \boldsymbol{v}(f)+\beta \boldsymbol{v}(g), \\ \text{(ii)} \quad & \boldsymbol{v}(f g)=f \boldsymbol{v}(g)+g \boldsymbol{v}(f), \end{aligned} \end{equation*}
where $\alpha$ and $\beta$ are arbitrary numbers and the functions are evaluated at the point $\mathcal{P}$. In this way of viewing tangent vectors, there is a one-to-one correspondence between vectors and derivatives. It is consistent, therefore, to adopt the view that instead of vectors, we can work with derivatives.
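Properties (i) and (ii) can be tested numerically for a vector written as $v^{\mu} \partial / \partial x^{\mu}$ in a chart. The point `P`, the components `V` and the two test functions below are hypothetical choices made for this sketch, and the partial derivatives are central-difference approximations.

```python
import math

P = (1.0, 2.0)   # chart coordinates of the point P (hypothetical)
V = (1.0, 3.0)   # components v^mu of the tangent vector at P (hypothetical)

def partial(func, x, mu, h=1e-6):
    """Central-difference approximation to d func / d x^mu at x."""
    up, down = list(x), list(x)
    up[mu] += h
    down[mu] -= h
    return (func(up) - func(down)) / (2.0 * h)

def v(func):
    """Action of the tangent vector on a function: v^mu * d func / d x^mu at P."""
    return sum(V[mu] * partial(func, P, mu) for mu in range(len(P)))

def f(x):
    return x[0] * x[1]

def g(x):
    return math.sin(x[0]) + x[1] ** 2

alpha, beta = 2.0, -0.5

linear_lhs = v(lambda x: alpha * f(x) + beta * g(x))
linear_rhs = alpha * v(f) + beta * v(g)
leibniz_lhs = v(lambda x: f(x) * g(x))
leibniz_rhs = f(P) * v(g) + g(P) * v(f)

print("linearity:", linear_lhs, linear_rhs)
print("Leibniz:  ", leibniz_lhs, leibniz_rhs)
```

Each pair agrees to within the accuracy of the finite differences, as expected for any first-order differential operator evaluated at a point.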
There is a class of curves that all pass through $\mathcal{P}$. If they all have the same tangent vector, then we can identify them. If we have
\begin{equation*} c_{1}(\lambda=0)=c_{2}(\lambda=0)=\mathcal{P} \tag{C.23} \end{equation*}
and
\begin{equation*} \left.\frac{\mathrm{d} x^{\mu}\left(c_{1}(\lambda)\right)}{\mathrm{d} \lambda}\right|_{\lambda=0}=\left.\frac{\mathrm{d} x^{\mu}\left(c_{2}(\lambda)\right)}{\mathrm{d} \lambda}\right|_{\lambda=0}, \tag{C.24} \end{equation*}
then these give the same vector component at $\mathcal{P}$. We identify the tangent vector $\boldsymbol{v}$ with the equivalence class of curves, rather than a single curve.
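To see the equivalence class at work, the sketch below takes two hypothetical curves, written in a two-dimensional chart, that satisfy eqns C.23 and C.24: they pass through the same point at $\lambda=0$ with the same first derivatives, but differ at higher order. Both give the same rate of change of a test function at $\lambda=0$.

```python
import math

def c1(lam):
    """A straight-line curve in the chart through P = (1, 2)."""
    return (1.0 + lam, 2.0 + 3.0 * lam)

def c2(lam):
    """A different curve through P with the same first derivatives at lambda = 0."""
    return (1.0 + math.sin(lam), 2.0 + 3.0 * lam + 5.0 * lam ** 2)

def f(x):
    """A hypothetical test function written in the chart."""
    return x[0] * x[1] ** 2

def rate_along(curve, h=1e-6):
    """Central-difference approximation to d(f o c)/d lambda at lambda = 0."""
    return (f(curve(h)) - f(curve(-h))) / (2.0 * h)

print("same point:", c1(0.0) == c2(0.0))
print("rates of change:", rate_along(c1), rate_along(c2))
```

The two curves are distinguishable only at second order in $\lambda$, which the tangent vector does not see.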
Lots of curves passing through $\mathcal{P}$ will have different tangent vectors. All of the tangent vectors at $\mathcal{P}$, one for each class of curves, form a vector space called a tangent space of the manifold $\mathcal{M}$ at point $\mathcal{P}$, denoted $\mathcal{T}_{\mathcal{P}} \mathcal{M}$. This is shown in Fig. C.15. We examine the generalization of the tangent space in the next section.

Fig. C.15 The tangent space contains all of the tangent vectors. (a) A set of vectors in the tangent space at a point $\mathcal{P}$. (b) The point of view used earlier in the book of a tangent plane to a surface relies on embedding the manifold in Euclidean space.
Fig. C.16 (a) The one-dimensional manifold with some tangent vectors identified at each point. (b) The tangent vectors represented as vertical lines. These are the fibres we use to form a bundle.
Fig. C.17 A fibre bundle $\mathcal{B}$ consisting of base space $\mathcal{M}$ and fibres $\mathcal{V}$. The projection $\pi$ collapses a fibre down to a point.
Fig. C.18 A fibre bundle and its projection $\pi$.

C. 11 Fibre bundles

Consider a curve, which is itself a one-dimensional manifold $\mathcal{M}$. Take tangents at each point along the curve [some examples are shown in Fig. C.16(a)]. Drawn in this way, neighbouring tangent vectors intersect each other creating a certain amount of confusion. To avoid this, we could instead draw them as in Fig. C.16(b), rotating them around so they no longer keep bumping into each other. Of course, in this new drawing they no longer so obviously resemble tangent vectors, but let's imagine that we could somehow still encode that information in them. Now lift the tangent vectors for each point off the line so that we have the state of affairs shown in Fig. C.17. This shows the one-dimensional manifold $\mathcal{M}$. Above each and every point in $\mathcal{M}$ there is one (one-dimensional) manifold $\mathcal{V}$ floating above it. The particular tangent vector at a point in $\mathcal{M}$ is represented by a point on the particular manifold $\mathcal{V}$ (like beads on an abacus). We call the manifold $\mathcal{V}$ a fibre. By combining the manifold $\mathcal{M}$ (the points) and all of the $\mathcal{V}$s (the fibres or tangent spaces) we obtain a new, two-dimensional manifold $\mathcal{V} \mathcal{M}$ known as a fibre bundle $\mathcal{B}$ or, in this special case of tangent spaces, a tangent bundle $\mathcal{T} \mathcal{M}$.
More generally, then, a fibre bundle $\mathcal{B}$ is a manifold defined in terms of two other manifolds $\mathcal{M}$ and $\mathcal{V}$. Manifold $\mathcal{M}$ is called the base space and $\mathcal{V}$ is called the fibre. The dimension of the fibre bundle $\mathcal{B}$ is always the sum of the dimensions of $\mathcal{M}$ and $\mathcal{V}$. There are many copies of $\mathcal{V}$ in $\mathcal{B}$. One complete copy of $\mathcal{V}$ stands above each point in $\mathcal{M}$.
We can undo the previous construction if we define a continuous map from $\mathcal{B}$ back down on to $\mathcal{M}$. This is called the canonical projection $\pi$ from $\mathcal{B}$ to $\mathcal{M}$. This map collapses each fibre down to the corresponding point in $\mathcal{M}$. If you like, it forgets about the information stored in the fibre, remembering only the point on $\mathcal{M}$ which the fibre was floating above.
The simplest example of a bundle is a product space of $\mathcal{M}$ with $\mathcal{V}$, which is written as $\mathcal{M} \times \mathcal{V}$. The points in $\mathcal{M} \times \mathcal{V}$ are pairs of elements $(a, b)$ where $a$ belongs to $\mathcal{M}$ and $b$ belongs to $\mathcal{V}$ (see Fig. C.18). This is often called a trivial bundle.
Example C.14
Perhaps the simplest of all tangent bundles is formed by making the base manifold the unit circle. We shall consider $\mathcal{T} S^{1}$, the tangent bundle of the circle $S^{1}$ and its tangent vectors. The tangent bundle $\mathcal{T} S^{1}$ is identical to the product space $S^{1} \times \mathbb{R}$, shown by the cylinder in Fig. C.19 (and so is a trivial bundle). Moreover, $\mathcal{T} S^{1}$ is a two-dimensional manifold which we can cover with coordinates. In $S^{1}$ a point is described by a coordinate $\theta$. A tangent vector $\boldsymbol{v}$ at any point $\mathcal{P}$ can be written as $\boldsymbol{v}=y \frac{\partial}{\partial \theta} \equiv y \boldsymbol{e}_{\theta}$, where $y$ is a coordinate in $\mathcal{T}_{\theta}$ (i.e. an amplitude taken from the tangent space floating above the particular angle $\theta$). The coordinates $(\theta, y)$ then tell us about the position on the base space and the coordinate along the fibre.
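The coordinates $(\theta, y)$ on $\mathcal{T} S^{1}$ translate directly into a small data structure. The sketch below is an illustration only: the names `BundlePoint`, `projection` and `as_arrow` are introduced here, and the picture of the circle embedded in the plane is an assumption made so that the vector $y \boldsymbol{e}_{\theta}$ can be drawn as an arrow.

```python
import math
from typing import NamedTuple

class BundlePoint(NamedTuple):
    """A point of the tangent bundle of the circle, in coordinates (theta, y)."""
    theta: float  # position on the base space S^1
    y: float      # coordinate along the fibre (component of the tangent vector)

def projection(b):
    """Canonical projection pi: forget the fibre coordinate, keep the base point."""
    return b.theta % (2.0 * math.pi)

def as_arrow(b):
    """Base point and tangent vector y e_theta for a unit circle drawn in the plane."""
    base = (math.cos(b.theta), math.sin(b.theta))
    vector = (-b.y * math.sin(b.theta), b.y * math.cos(b.theta))
    return base, vector

b = BundlePoint(theta=0.5, y=3.0)
base, vector = as_arrow(b)
print("projection:", projection(b))
print("base point:", base, "tangent vector:", vector)
```

Two bundle points with the same $\theta$ but different $y$ project to the same point of $S^{1}$; the fibre coordinate is exactly the information that $\pi$ discards.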
Bundles are said to be locally trivial if, over a small enough region of the base space, they look like a product space. We can ask whether they're also globally trivial: whether the whole bundle can be represented by a product $\mathcal{M} \times \mathcal{V}$. The example in Fig. C.19, where the bundle resembles a cylinder, is indeed globally trivial. An interesting counterexample of a bundle that is not globally trivial is a twisted bundle. This resembles $\mathcal{M} \times \mathcal{V}$ locally, but as we move around $\mathcal{M}$, the fibres twist, so that globally $\mathcal{B}$ is different from $\mathcal{M} \times \mathcal{V}$.

Example C.15

Once again, take $\mathcal{M}$ to be the circle $S^{1}$ and $\mathcal{V}$ to be the real line $\mathbb{R}$. The trivial bundle simply resembles a two-dimensional cylinder. Now construct a twisted bundle, by forming the fibres into a Möbius strip${}^{20}$ (see Fig. C.20). Locally, this is the same as the cylinder; globally it is not. To see the local similarity, remove a point $\mathcal{P}$ from the base space. We then have a segment $S^{1}-\mathcal{P}$ and the bundle above this segment can be deformed to look the same as the cylinder. It is only when we look at the whole of the base space that we notice the difference. To see this, consider two segments: $S^{1}-\mathcal{P}$ and $S^{1}-\mathcal{Q}$, where $\mathcal{P}$ and $\mathcal{Q}$ are different points. Each is locally trivial (i.e. each can be deformed to look like the cylinder). However, on gluing them together (with a twist) to make a whole we form the Möbius strip.
Formally, we characterize a bundle by looking at its cross section. The cross section of the bundle $\mathcal{B}$ is a continuous image of the base space $\mathcal{M}$ in $\mathcal{B}$, which meets each fibre at a single point (Fig. C.21). This is called the lift of the base space into the bundle. So if we apply the lift via a continuous function $s: \mathcal{M} \rightarrow \mathcal{B}$, followed by the projection $\pi: \mathcal{B} \rightarrow \mathcal{M}$, we get the identity map from $\mathcal{M}$ into itself
\begin{equation*} \pi \circ s=I \tag{C.25} \end{equation*}
For the trivial bundle $\mathcal{M} \times \mathcal{V}$ the cross sections look like continuous functions on $\mathcal{M}$ which take values in the space $\mathcal{V}$. So a cross section of $\mathcal{M} \times \mathcal{V}$ assigns, in a continuous way, a point of $\mathcal{V}$ to each point on $\mathcal{M}$. This is like an extension of the ordinary idea of a graph of a function.
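The statement that a lift followed by the projection gives the identity can be checked numerically for the trivial bundle $S^{1} \times \mathbb{R}$. The following is a minimal sketch: the helper names `make_section` and `projection` are hypothetical, and the choice of $\cos\theta$ as the function being graphed is an arbitrary assumption.

```python
import math
from typing import Callable, Tuple

Point = Tuple[float, float]  # (theta, y): a point of the trivial bundle S^1 x R


def make_section(f: Callable[[float], float]) -> Callable[[float], Point]:
    """Build the lift s: theta -> (theta, f(theta)), i.e. the graph of f over the base."""
    def s(theta: float) -> Point:
        return (theta, f(theta))
    return s


def projection(p: Point) -> float:
    """pi: (theta, y) -> theta."""
    return p[0]


# cos is 2*pi-periodic, so the lift closes up continuously around the circle.
s = make_section(math.cos)
samples = [2 * math.pi * k / 8 for k in range(8)]

# pi(s(theta)) returns the base point we started from at every sample.
composition_is_identity = all(projection(s(t)) == t for t in samples)
```

The section meets each fibre exactly once because `s` returns a single value of `y` for each `theta`, which is the graph-of-a-function picture described above.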

Example C.16

Consider the cylindrical product bundle $\mathcal{M} \times \mathcal{V}$. A cross section looks like a curve intersecting each fibre once as it goes around the cylinder. We can also mark out the curve that passes through the zero of each fibre, called the zero section. There's no guarantee that any old cross section intersects this line of zeros. The Möbius bundle is more complicated, but one thing that can be said is that every cross section has to cross the zero section. This gives us a way of characterizing the difference between the Möbius strip and the trivial bundle.
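The claim that a cross section of the Möbius bundle has to cross the zero section can be illustrated with a sketch. The assumption made here is that, after cutting the base circle at $\theta=0$, a cross section is represented by a continuous function $f$ on $[0, 2\pi]$ obeying the twisted gluing condition $f(2\pi)=-f(0)$. The intermediate value theorem then guarantees a zero, which the bisection routine locates. The functions `find_zero`, `moebius_section` and `cylinder_section` are hypothetical examples, not taken from the text.

```python
import math
from typing import Callable


def find_zero(f: Callable[[float], float], a: float, b: float, tol: float = 1e-12) -> float:
    """Bisection: locate a zero of a continuous f on [a, b] when f(a), f(b) differ in sign."""
    fa, fb = f(a), f(b)
    if fa == 0.0:
        return a
    if fb == 0.0:
        return b
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tol:
        mid = 0.5 * (a + b)
        fm = f(mid)
        if fm == 0.0:
            return mid
        if fa * fm < 0:
            b = mid           # sign change lies in [a, mid]
        else:
            a, fa = mid, fm   # sign change lies in [mid, b]
    return 0.5 * (a + b)


def moebius_section(theta: float) -> float:
    """Example section of the twisted bundle: f(2*pi) = -f(0)."""
    return math.cos(theta / 2) + 0.3 * math.sin(theta)


def cylinder_section(theta: float) -> float:
    """Example section of the trivial bundle: periodic and never zero."""
    return 2.0 + math.cos(theta)


# The twist forces opposite signs at the two ends of the cut, hence a zero in between.
zero_crossing = find_zero(moebius_section, 0.0, 2 * math.pi)

# On the cylinder no such constraint exists: this section stays above the zero section.
cylinder_minimum = min(cylinder_section(2 * math.pi * k / 1000) for k in range(1000))
```

For this particular choice, $f(\theta)=\cos(\theta/2)\,[1+0.6\sin(\theta/2)]$, so the only zero on $[0,2\pi]$ sits at $\theta=\pi$, while the cylinder section never drops below 1.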
Fig. C.19 The bundle $\mathcal{T} S^{1}$ floating above its base space.
${}^{20}$ August Ferdinand Möbius (1790-1868). The Möbius strip (a two-dimensional surface that, when embedded in three dimensions, has only one side) was discovered by Möbius and, independently, by Johann Benedict Listing.
Fig. C.20 (a) The globally trivial bundle. (b) The bundle with a twist, forming a Möbius strip.
Fig. C.21 The cross section of the bundle $\mathcal{B}$, formed from the lift $s$ of the base space $\mathcal{M}$. This can be thought of as a way of graphing a function on $\mathcal{M}$ in $\mathcal{B}$.

Chapter summary

  • The set $\mathcal{M}$ is a manifold if each point in $\mathcal{M}$ has an open neighbourhood which has a continuous 1-1 map onto an open set of $\mathbb{R}^{n}$ for some $n$.
  • Morphisms allow us to map between manifolds. A diffeomorphism relates two manifolds which are endowed with a differentiable structure (meaning that they are smooth) and is the most useful morphism in general relativity. A diffeomorphism that maps a manifold onto itself is equivalent to an active coordinate transformation.
  • Compact regions neither go off to infinity nor have parts removed, whether from their interior or from their boundary.
  • Vectors can be defined in terms of derivatives using curves on the manifold and mappings using the expression
\begin{equation*} \boldsymbol{v}[f]=\frac{\mathrm{d}(f \circ c)}{\mathrm{d} \lambda} . \tag{C.26} \end{equation*}
  • A fibre bundle $\mathcal{B}$ is the combination of a base space $\mathcal{M}$ and a fibre space $\mathcal{V}$, with a fibre defined at each point in the base space.